D A P H N E

D - DATA           version 1.0
A - ACQUISITION    October 1985
P - PROGRAMS       by Jim Groeneveld
H - HANDLING       Schoolweg 14
N - NUMERICAL      8071 BC Nunspeet
E - ENTRIES        0341 260 413

SUMMARY

DAPHNE is a computer library of (mainly) interactive, user-friendly programs for data entry, revision and examination. Numerical data are requested or displayed by DAPHNE on separate lines, each preceded by its (user-defined) alphanumerical description (e.g. SPSS VAR LABELS). From these DAPHNE builds rectangular data matrices with a maximum of 5000 variables and an unlimited number of cases. The data are written (or read) as values in a format of the user's choice, including a. FTN-formatted (=SPSS fixed), b. FTN-list directed (=SPSS freefield) and c. FTN-buffer i/o (=SPSS binary).

Punch documents thus become redundant and even useless. Data are transferred directly from the original medium into the computer in a conveniently arranged way, so the risk of errors in a database is small.

Optionally (but very strongly recommended) DAPHNE uses a memory (direct access) file, on which all data of the case being handled are kept as a copy of the central memory, and which in addition holds all parameters set, the fixed format (if any) and the variable labels. Thus it is possible to interrupt data entry or revision in the middle of a case (which is not yet written to the sequential database).
This makes DAPHNE insensitive to disturbances: if any disturbance occurs, nothing that has been entered is lost.

DAPHNE may be regarded as a kind of editor, not for alphanumerical text, but for numerical values. A large number of commands is available for displaying and revising data, as well as suitable HELP menus.

DAPHNE is operational on the Cyber computers at the AAC-TNO in The Hague. Communication with her is in English. The most important programs are:

ASKDATA
   for creating a new database or extending an existing one.

REVISE
   for changing a database by editing it while copying it to a new one.

REVIEW
   for obtaining a complete listing (on paper) of an existing database in the same user-friendly form as on the screen (one datum per line, preceded by its variable label, a maximum of 50 data per page, each case starting on a new page). Produces a relatively large amount of output!


D A P H N E

1. PURPOSE

DAPHNE is an application package designed for the construction of (large) databases. She consists of a computer library of (mainly) interactive, user-friendly programs supporting data entry, revision and examination. The user is asked to enter each value on the same screen line on which its description is displayed; the descriptions serve as questions to be answered.

DAPHNE produces (and reads) rectangular data matrices of (real) numerical values. Cases (rows) may contain up to 5000 values (variables, columns). The number of cases is unlimited.

DAPHNE may be viewed as a kind of editor, not for alphanumerical text, but for numerical values. A large number of commands is available for displaying and revising data, as well as suitable HELP menus for learning these commands and carrying them out step by step.

2. LABELS

DAPHNE (optionally) expects up to 5000 variable labels, each label (up to 60 characters) on a single line, available on a separate (local or permanent) file. These labels are presented on the screen when values are entered or when values are displayed by DAPHNE.
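The layout of such a label file can be pictured with a small sketch. It is an illustration only, not DAPHNE's own (FORTRAN) code: the function names read_labels and label_for are hypothetical, and rejecting labels longer than 60 characters is an assumption, since the text does not say how DAPHNE treats them.

```python
MAX_LABELS = 5000        # limit stated in the text
MAX_LABEL_LENGTH = 60    # limit stated in the text


def read_labels(lines):
    """Return the labels found in an iterable of text lines, one label per line."""
    labels = []
    for line in lines:
        label = line.rstrip("\r\n")
        if len(label) > MAX_LABEL_LENGTH:
            # Assumption: this sketch rejects over-long labels.
            raise ValueError(
                f"label {len(labels) + 1} exceeds {MAX_LABEL_LENGTH} characters"
            )
        labels.append(label)
        if len(labels) > MAX_LABELS:
            raise ValueError(f"more than {MAX_LABELS} labels")
    return labels


def label_for(labels, n):
    """Return the label of variable number n (variable numbers start at 1)."""
    return labels[n - 1]


if __name__ == "__main__":
    demo = read_labels(["Age in years\n", "Body weight (kg)\n"])
    print(len(demo), label_for(demo, 2))
```

The position of a label in the file is all that ties it to a variable: line n describes variable number n.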
This link between labels and values (via variable numbers) is one of the main features of DAPHNE.

The labels may consist of any text created by the user outside DAPHNE, for example with the aid of a text editor. They may contain any meaningful comment, such as questions, descriptions, etc. When entering data the labels are displayed one at a time, and the user simply enters the corresponding value beside each of them, in effect answering one's own questions, after which the next label is displayed, and so on.

Instead of creating labels oneself one may also use SPSS VAR LABELS which are already contained within SPSS input lines or which are written by SPSS using WRITE FILEINFO. These labels can be extracted from SPSS statements with the program SPSSLAB described later.

3. FORMATS

DAPHNE writes (and reads) data using one of four different i/o types:

1. unformatted,
2. SPSS-binary (=buffer i/o),
3. freefield (=list directed) and
4. fixed.

Any fixed (FORTRAN) format for real values (enclosed within parentheses and using only F, E and G descriptors) should be available on another separate (local or permanent) file, created outside DAPHNE.

The traditional punch documents are redundant when using DAPHNE and may be skipped. Data are transferred directly from the original medium into the computer in a conveniently arranged way. The values are entered on the keyboard and displayed on the screen in a freefield format, thus without bothering about field widths and decimal places. Thereby the risk of errors in a database is small.

The user has to take care that the format defined matches the data. Neither this correspondence nor the syntax of the format is checked. Make sure that the decimal point is included in the defined field widths of the real values, even if they consist only of whole numbers. Accidentally faulty values (being too large) may exceed the field widths predetermined by a certain fixed format.
This causes asterisks to be written to the database instead of the values. Thus an accidentally unmatched fixed format, syntax errors in the format or faulty values may cause unrecoverable errors in the database. Therefore, during initial data entry or revision, it is recommended to write the data in a binary format, to be sure that they remain recoverable as (possibly faulty) values once stored. After completing the creation or revision of the database it may be reformatted using the desired fixed format with the program REFORM described later. Any errors in the format are then non-fatal, because the original database still exists in binary form.

4. MEMORY

DAPHNE stores all user choices and maintains the values of the case currently in memory on an optional (local or permanent direct access) memory file, which remains accessible and interpretable after an interruption by the user or a computer breakdown. The next time DAPHNE is called, the memory file enables the user to proceed from the point at which the interruption occurred, as if nothing had happened in the meantime. Besides, DAPHNE operates faster with this memory file than without it.

After entering, checking and (if necessary) correcting a complete case, it may be written from the central memory to the selected database in the desired format.

5. USE

DAPHNE is accessible on the Cyber computers at the AAC-TNO in The Hague by entering:

   ATTACH,DAPHNE,ID=JIM.
   LIBRARY,DAPHNE.
   program[,optional parameters, explained in section 11].

"program" starts running DAPHNE and represents one of the following interactive (I), online (O) and batch (B) programs (online here means running from the terminal, though the program is not interactive):

ASKDATA (I)
   for creating a new database or adding to an existing one.

REVISE (I)
   for revising an existing database. The revision is written to a different (new) file; the original one is left unchanged and has to be deleted (purged) deliberately by the user outside DAPHNE, in the computer's operating system.

EXAMINE (I)
   for examining an existing database; meant for quick reviews and searches. No changes can be made.

REFORM (IB)
   for reading an existing database with a certain chosen format and writing a copy of it to another file with a different format (see section 3).

REVIEW (IB)
   for obtaining a complete listing of an existing database, case by case, one variable label and corresponding value per line, max. 50 lines per page, on a print file. Produces a rather large amount of output.

SPSSLAB,bcdout,labels. (OB)
   for extracting suitable labels (one per line) from SPSS input or SPSS WRITE FILEINFO output. "bcdout" represents the file name of the SPSS card image file to serve as input for SPSSLAB. Initial VAR LABELS may be contained within continuation lines. "labels" represents the output file name in which each line contains one label, meant to be read by DAPHNE. BCDOUT and LABELS are the default file names.

PREVIEW,parameters. (O)
   a CCL procedure (transfer or batch file) creating a batch job in which REVIEW is called. This is a better alternative to using REVIEW interactively, because REVIEW takes quite an amount of (computer) time to execute and, when started online, would occupy the interactive terminal connection unnecessarily long. Besides, running REVIEW in a batch job is usually cheaper than running it interactively. PREVIEW requires parameters, which are:

      J,T,PAR,AC,TID,M,ID,O,R,L,REP,HEAD.

   meaning:

   J=jobname (max. 5 characters)
   T=time limit specification in octal CP seconds (optional, default: 77)
   PAR=any extra parameter in jobcard, e.g. ED...., ET....,P0,P1,etc. (optional, default empty)
   AC=account number (within dollar signs)
   TID=terminal identification to send the review to (max.
   2 characters, optional, default: C - central site)
   M=local (or permanent) file name of the applicable memory file created by ASKDATA or REVISE (not required if there is none)
   ID=permanent file identification of files to be used (max. 9 characters)
   O=local (or permanent) file name of the database to be read (not required if the M parameter has been given)
   R=format with which the database is to be read. Possibilities for "format" are: NOFORMAT or SPSSBINARY or FREEFIELD or the local (or permanent) file name of a file containing a FORTRAN fixed format. (default: FREEFIELD) (not required if the M parameter has been given)
   L=the number of variables in the database or the local (or permanent) file name of the file containing the labels to be used by DAPHNE (not required if the M parameter has been given)
   REP=repetition factor for number of reviews. (optional, default: 0)
   HEAD=any comment (max. 40 characters between dollar signs) appearing as large capitals on the first page. (optional, default empty)

   Some of these parameters are necessary for the computer's operating system (NOS/BE). The user is referred to the appropriate manuals for their application. Other parameters are specifically necessary for DAPHNE and will be explained in detail later. The parameters may be entered using both the positional and the equivalence mode (explained in the CDC Cyber Control Language section of the NOS/BE reference manual). See also section 11.

INFORM[,parameters]. (OB)
   for obtaining this text (printed on paper). Without parameters this text appears on the terminal screen as ASCII text, or, when INFORM is called in a batch job, this text is written on file OUTPUT and printed as DISPLAY text (capitals only). Optional parameters are:

      P,C,TID,FID,REP,

   meaning:

   P=any local file name on which this text is written
   C=ASC or DIS. Forces ASCII or DISPLAY coded text. (default: online ASCII, batch DISPLAY)
   TID=terminal identification to send the text to, e.g.
   TID=C sends text to central site for printing (only if P parameter is not used, max. 2 chars.)
   FID=file identification for the text if sent to TID (only if P parameter is not used, max. 5 chars.)
   REP=repetition factor for number of texts (default 0)

HELP[,parameters]. (OB)
   for obtaining brief information on the screen on how to obtain this INFORM document as described above. The parameters are the same as those for INFORM.

All local file names (lfns) should meet the requirements imposed by the operating system, which in this case means a maximum of 7 letters or digits, beginning with a letter. Though the requirements for permanent file names (pfns) are broader, DAPHNE expects them to satisfy the same requirements as the lfns, because DAPHNE attaches them via local files (lfs, scratch files on disk, only existing during the interactive run or within a batch job) using the same names for both of them. If one wants to use different pfns one should attach the appropriate permanent files (pfs, on permanent disk devices) with valid lfns before DAPHNE is called.

The information presented up to now is already sufficient to apply DAPHNE if the user is familiar with the current operating system of the Cyber computers. It is not necessary at this point to know precisely how to communicate with DAPHNE, because DAPHNE is equipped with suitable HELP menus, which guide the user through the programs efficiently. The user merely has to answer questions or select appropriate options. Invalid entries are clearly indicated as such, together with a presentation of possible alternatives.

Nevertheless it may be worthwhile to discuss some of DAPHNE's qualities. This is especially useful for the many extra possibilities not taught by DAPHNE herself. Firstly a description will be given of what happens the first time the user starts running ASKDATA, which is partly the same when using REVISE, EXAMINE, REFORM or REVIEW.
Next, all possible commands for editing data in ASKDATA and REVISE (and, to a limited extent, in EXAMINE) will be described. After that the optional parameters which may be added to the execute command of ASKDATA, REVISE, EXAMINE, REFORM and REVIEW will be discussed extensively. They enable changes in parameters already saved when the memory file was initialized. Finally DAPHNE's origin and development will be noted briefly.

6. RUN

When ASKDATA is executed to start creating a new database, DAPHNE asks some questions in order to initialize file names and other parameters, which are saved on a memory file if one is used. They are described below:

1. the -lfn- (local file name) of the memory file, which is a disk file containing a copy of the contents of specific variables currently in memory, storing names and values entered by the user. If the memory file does not yet exist it is created and initialized with all of the following parameters. If it already exists (when ASKDATA is called on subsequent occasions) all of the following questions, except for the second one, are skipped. During data entry, each time the value of a variable changes because a value is entered, it is written to the memory file. If the user does not want to make use of this facility a space or NOMEMORY should be entered, in which case entries are only stored in the computer memory.

   A memory file can only be initialized, written and read by ASKDATA and REVISE. Memory files created by ASKDATA are not compatible with those created by REVISE. EXAMINE, REFORM and REVIEW can only read existing memory files, created by either ASKDATA or REVISE. Do not specify a non-existent or invalid memory file when running EXAMINE, REFORM or REVIEW. DAPHNE tests for such situations, issues appropriate error messages and aborts herself.

2. the -id- (identification) of all -pfs- (permanent files), which one is strongly recommended to enter, for example the user's name. If one works only with local files this parameter may be left blank.

3. the -lfn- of the data file to be created, which is the most essential of all parameters. If it is left blank the default file name NEWDATA will be used. During creation and modification all values of a case are stored in the central memory (and on the memory file if used), and only after completing a case may the user choose whether or not to write it to the data file.

4. the -lfn- of the label file. The number of labels (one per line) is counted and is the number of variables to be used, which may not exceed 5000. The labels themselves are not stored in the central memory. If no memory file is used, each time a label is displayed when entering values it is read from the (sequential) label file. This may take some time when jumping through them. If, however, a memory file name was declared, the labels are copied onto that memory file and during data entry they are read from the memory file. The labels on the memory file are accessed much more quickly than those on the label file. Besides, running ASKDATA with an already existing memory file causes this question to be skipped: the label file is already known, and so are the labels, and they are not read again from the label file.

   If one has no labels on a label file one may enter the number of variables instead. If one does not even know the exact number of variables one may enter NOLABELS (ASKDATA only), in which case one enters END after entering all values for the first case, thus defining the number of variables (see section 9).

5. the -lfn- of the format file containing the FORTRAN fixed format with which to write all (real) values of a completed case to the database. If there is no format file the alternatives are FREEFIELD format (which is the default when entering a space), SPSSBINARY (FTN buffer i/o) and NOFORMAT (unformatted, the standard FORTRAN binary format). SPSS can read all these formats except for the last one.
   Any fixed format is read into the central memory and copied to the memory file if present.

6. the MISSING VALUE. This is merely a constant value which during data entry is assigned automatically to variables that are skipped because there are no data for them. If one wants to use different missing values one should enter them during data entry as the value itself, or redefine the missing value with the command MISSING. A third possibility is to define a temporary value when skipping (see MISSING and ASSIGN commands, section 8). The default missing value is 0.

When REVISE, EXAMINE, REFORM and REVIEW are executed similar parameters are requested. The difference from those of ASKDATA is that there is no database to be written, but there is an old database to be read with an "old" format and, with REVISE and REFORM only, there is a new database to be written with a "new" format. The MISSING VALUE is only requested with ASKDATA and REVISE.

After the initial parameters have been entered DAPHNE responds with a review of them and the user is offered the possibility to amend them. However, parameters already saved in an existing memory file cannot be changed anymore (except for the missing value). If a memory file already exists and one wants to change one or more parameters, this is only possible for the current execution of ASKDATA, by adding them to the run command. In such an instance the specified parameter overrides the corresponding information already stored in the memory file. If it is a label file name, its contents are not read from the memory file during the run, but from the specified file instead (see section 11).

From this point on, if the programs REFORM or REVIEW are executed, they proceed without user intervention other than the system abort command (%A). All necessary files are read and processed immediately online.
REFORM and REVIEW yield output files containing all the information of the (old) database, all cases and all variables, which may be printed (system ROUTE command) or serve as input for other (statistical) programs.

If the database to which data are to be added already exists, DAPHNE counts the number of cases it already contains, compares this with the number remembered in the memory file (if any) and displays both numbers. (It is no problem if they do not match: after a new case has been written to the database the actual number of cases is restored in the memory file.) The database is positioned at the end-of-file, after which appending can take place (see also section 10).

If the user leaves DAPHNE in the middle of a case (see STOP command, section 8) and a memory file is used, the next time DAPHNE is started to continue data entry or revision the value preceding the current one is displayed, which is generally the last one entered previously. The user can then proceed entering values starting with the current one.

7. VALUES

After the initialization of parameters DAPHNE is ready to receive the values to be entered by the user and starts asking for them by displaying their variable numbers (and labels) in succession. As mentioned before, values are entered on the keyboard and displayed on the screen in a freefield format (FORTRAN list directed). Values may be entered in the E as well as the D notation. Internally they are stored in single precision real variables with approximately 14 significant decimal digits. The values should be within the range of 10^-293 to 10^+322.

Instead of entering values in the form of digits the user may also specify them by way of specific entries, each with its own meaning. Entering @ stands for the MISSING VALUE, entering VALUE (or V) stands for the value of the recent variable number (= the value most recently displayed or entered) and specifying VALUE n indicates the value of the variable with sequence number n.
If n is negative it is interpreted as a relative variable number: the variable lying -n positions before the current variable number in the sequence of variables. Instead of defining n by a positive or negative number one may also use one of the terms representing certain variables described in the next section.

8. COMMANDS

Instead of responding with a value the user may also enter one of the commands available. One of these is HELP, which causes some of the main commands to be displayed on the screen. After choosing a command the user is asked step by step for the supplementary data needed to perform the desired action. Many commands are available for listing and changing values, for jumping forwards and backwards in the sequence in which variables (numbers and labels) are requested, and for several other purposes. The most important commands are:

   LIST CORRECT ALTER REPEAT RESUME OMIT INSERT MISSING ASSIGN STOP HELP TEACH DO

Most of these commands may be abbreviated by entering only their first letter, and most of them may be entered without supplementary data, either because none are needed or because they are asked for. However, all of them may be entered together with specific supplementary data in order to avoid the stepwise request method once it has become familiar and tedious.

With most commands the further information needed mainly consists of variable numbers. Variables are not accessed via their variable names and labels, which only appear on the screen as comment for the user, but via their sequence numbers. Other data which DAPHNE sometimes needs may be values. With a few commands one may enter a range instead of a variable number: FROM number TO number. Similarly, in a few instances an infinite value range (to -I or +I or both) may be given instead of a value. These ranges will be explained in more detail later, under the descriptions of the appropriate commands.
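Because variables are addressed purely by sequence number, the special value entries of section 7 (@, VALUE and VALUE n) amount to simple look-ups. The following is a minimal sketch of that interpretation, not DAPHNE's actual code: the function name resolve_value and its arguments are hypothetical, the variable terms (FIRST, LAST and so on) are not handled, and treating the D exponent letter like E is an assumption about how the entry would be read outside FORTRAN.

```python
def resolve_value(entry, values, nc, nr, missing):
    """Translate one keyboard entry into a number.

    entry   -- the text typed by the user
    values  -- dict mapping variable number (1-based) to its value
    nc      -- current variable number (the one being asked for)
    nr      -- recent variable number (last one displayed or entered)
    missing -- the MISSING VALUE constant
    """
    words = entry.split()
    if words == ["@"]:
        return missing
    if words and words[0].upper() in ("VALUE", "V"):
        if len(words) == 1:
            return values[nr]          # VALUE alone: the recent variable
        n = int(words[1])
        if n < 0:
            n = nc + n                 # negative n: -n positions before the current one
        return values[n]
    # Plain number; FORTRAN's D exponent letter is mapped to E for Python.
    return float(entry.strip().upper().replace("D", "E"))


if __name__ == "__main__":
    case = {1: 10.0, 2: 20.0, 3: 30.0}
    for text in ("@", "V", "VALUE 1", "VALUE -2", "1.5D3"):
        print(text, "->", resolve_value(text, case, nc=4, nr=3, missing=0.0))
```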

Thus commands may be entered both with (direct) and without (stepwise) the necessary additions, while intermediate forms are also valid (semi-direct). With the last type the user is asked stepwise for the remaining information. If, with either the direct or the stepwise method, any of the expected entries is invalid, the user is prompted for that entry only, enabling DAPHNE to respond or act according to the command as efficiently as possible. If one is using the stepwise method, or is being prompted for a valid entry, and wants to cancel the command on second thoughts, one may simply enter BREAK (or B), after which normal data entry resumes by displaying the current variable number (and label) again.

The last command, including its direct additions, is always remembered and may be repeated by entering DO (or D). The command itself is displayed by entering DO DISPLAY (or D D). A list of available commands is presented by DAPHNE if TEACH is entered, explaining their syntax, or if HELP (or H) is requested first, followed by INFO, indicating their use (there is no abbreviation for INFO).

With all commands which require variable numbers the user may also enter negative numbers instead. These are not interpreted as absolute variable numbers but as relative ones, with respect to the current variable number displayed or to a number already indicated previously within the same command. They represent deviations from the current variable number or indicate the total number of variables to be acted upon, depending on the command with which they are used. The exact actions when using negative numbers are discussed later in the descriptions of the appropriate commands.

Finally, instead of entering positive or negative variable numbers one may also indicate variables using the terms FIRST, LAST (L), END (E), CURRENT (C), PREVIOUS (P), NEXT (N) and RECENT (R). The CURRENT variable number is the one whose value is currently being asked for.
RECENT indicates the last variable number displayed with its (entered) value. PREVIOUS is RECENT minus 1 and NEXT is RECENT plus 1. END is LAST plus 1 and indicates the end-of-case state, which will be discussed later (see section 9).

Almost all commands (if not discussed already) are described below in detail in their most direct form, including deviations from the direct form when using the stepwise method. In these descriptions variable numbers are represented by n, n1 and n2, while a value is denoted by v. Any value should satisfy the syntax rules explained in the previous section. The current variable number is indicated by nc and the recent one by nr. The total number of variables is represented by nl. Optional parameters are enclosed within brackets. Choices are separated from each other by the broken vertical bar.

L I S T

LIST is used for displaying the values of one or more variables, or for searching for variables with a specific value or a range of values and displaying them. The different syntax types are:

LIST
   lists the value vr of the recent variable nr, which is mostly the current one minus one, being the last one entered.

LIST n
   lists the value of variable number n (1≤n≤nl). If n is negative the resulting variable number will be nc+n, that is, the variable that absolute number of positions before the current one.

LIST ALL [v]
   lists the values of all variables 1 (FIRST) to nc-1 (CURRENT minus one, not LAST!) in blocks of 20 at a time, continued or interrupted at the user's choice. Any optional value added to the command causes a search for that value within the same range of variables. Instead of only one value the user may also enter a range of values, the syntax of which will be described later with the ALTER command (see RANGE).

LIST ALL TO [n [v]]
   lists the values of all variables 1 (FIRST) to n, similar to the previous command (1≤n≤nl). If a case has not yet been completed and one wishes to list all variables, one can simply enter: L A T L.
   A negative n indicates the absolute number of variables to be listed or searched, which is the same as a positive one in this case.

LIST TO [n [v]]
   lists the values of all variables nc (CURRENT) to n in much the same way (nc≤n≤nl). If n is negative the resulting variable number is nc-1-n, causing n (absolute) values to be displayed or searched.

LIST FROM [n1 [TO [n2 [v]]] | [v]]
   lists the values of the variables n1 to n2 similarly (1≤n1≤n2≤nl). If the optional TO is not included, the range of n1 to nc-1 is assumed (1≤n1≤nc-1). If n1 is negative the resulting starting variable number is N1=nc+n1, causing DAPHNE to start listing or searching n1 (absolute) variables before the current one in the sequence. If n2 is negative the resulting closing variable number is N2=N1-1-n2, which causes n2 (absolute) values to be listed or searched.

Obviously the stepwise communication method is hardly applicable to the LIST command. Only if faulty parts of the command are entered, or if, for example, no more than L F is entered without specifying the variable number, will DAPHNE ask stepwise for additional information.

LIST is valid within both ASKDATA and REVISE as well as within EXAMINE. When executing EXAMINE the current variable is always equal to the last plus one, the end-of-case state (see section 9).

C O R R E C T

CORRECT is used for changing the value of only one variable at a time. Its syntax is:

   CORRECT [n [v]]

which defines the value of variable number n as v (1≤n≤nc-1). If n is negative it is added to the current variable number to obtain the variable to be accessed, nc+n, as in the simple LIST command. The value v, however, may not indicate a range of values, but only one specific value, which is assigned to the specified variable number.

CORRECT is only valid when executing ASKDATA or REVISE.

A L T E R

ALTER is a very powerful command for changing one or more values meeting certain criteria, within a range of variables.
Possible syntax forms are:

   ALTER [FROM [n1 [TO [n2 [v]]]]] | [FROM|TO [n [v]]] | [v]

The FROM and TO conventions, including those for negative numbers, are the same as with the LIST command, but the variable numbers are only valid within the limited range of 1≤n1≤n2≤nc-1. If FROM and TO are not specified, their respective default numbers, defining the extremes of the range, are 1 and nc-1.

The value v represents an old value or an old value range (within the specified variable range) that should be replaced by one or more new values. Clearly the new value cannot be given with the direct command method. DAPHNE always asks for one (or more) new value(s).

Instead of entering an old value the user may choose a range of old values just by typing RANGE (or R). This causes information on the specification of value ranges to be presented. Instead of replying with RANGE one may also enter the value range specification directly, which is quite simple, as follows:

= [v] or EQ [v]
   is equivalent to entering only one single old value, which has to be replaced.

< [v] or LT [v]
   means all values lower than the specified value.

> [v] or GT [v]
   indicates all values greater than the specified value.

<= [v] or LE [v] or =< [v]
   stands for all values lower than or equal to the specified value.

>= [v] or GE [v] or => [v]
   searches for all values greater than or equal to the specified value.

<> [v] or NE [v] or # [v]
   represents all values unequal to the specified value; it causes action upon all values except for the specified one.

ALL
   causes DAPHNE to consider all values within the previously specified variable range.

These value range specifications (except ALL) are also valid with the LIST command.

The new value may be defined as a single value, which will replace all old values (within the variable and value ranges specified).
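The value range specifications listed above can be summarised in a short sketch. It is an illustration under assumptions, not DAPHNE's implementation: the names make_matcher and alter are hypothetical, the operator and the value are assumed to be separated by a space, and the interactive confirmation DAPHNE offers for each variable is left out, so every matching value is replaced at once.

```python
import operator

# Every spelling listed in the text, mapped to the corresponding comparison.
RANGE_OPERATORS = {
    "=": operator.eq, "EQ": operator.eq,
    "<": operator.lt, "LT": operator.lt,
    ">": operator.gt, "GT": operator.gt,
    "<=": operator.le, "LE": operator.le, "=<": operator.le,
    ">=": operator.ge, "GE": operator.ge, "=>": operator.ge,
    "<>": operator.ne, "NE": operator.ne, "#": operator.ne,
}


def make_matcher(spec):
    """Build a predicate from a specification such as '>= 5', 'NE 0' or 'ALL'."""
    words = spec.split()
    if len(words) == 1 and words[0].upper() == "ALL":
        return lambda value: True
    if len(words) == 1:
        # A bare value is equivalent to '= v'.
        single = float(words[0])
        return lambda value: value == single
    compare = RANGE_OPERATORS[words[0].upper()]
    limit = float(words[1])
    return lambda value: compare(value, limit)


def alter(values, n1, n2, spec, new_value):
    """Replace the matching values of variables n1..n2 (1-based, inclusive).

    Returns the variable numbers that were changed.
    """
    matches = make_matcher(spec)
    changed = []
    for n in range(n1, n2 + 1):
        if matches(values[n - 1]):
            values[n - 1] = new_value
            changed.append(n)
    return changed


if __name__ == "__main__":
    case = [1.0, 5.0, 9.0, 5.0, 2.0]
    print(alter(case, 1, 4, "GE 5", 0.0), case)
```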
DAPHNE does not assign this value instantaneously to the variables concerned, but offers the user for each variable the possibility of entering a VETO response. That is, the user may decide variable by variable whether to perform the alteration or not. The VETO response is described in more detail with the ASSIGN command later. One can also switch it off in order to have all alterations carried out at once.

On the other hand one might wish to assign different new values to variables with the specified old value(s). In such instances, instead of entering one specific new value, the user should reply with MORE (or M). Thereupon DAPHNE displays, one by one, the variable numbers (and labels) of the variables whose values should be altered, offering the user the opportunity to enter the new values next to them, just as during the normal data entry procedure. In this case, instead of specifying a value, one may also type KEEP (or K) in order to leave the old value of the variable concerned unchanged. Entering BREAK (or B) cancels the ALTER procedure at the point reached and resumes normal data entry.

ALTER is only valid within ASKDATA and REVISE.

R E P E A T   F R O M

REPEAT is also intended for changing more than one value at a time. Its syntax, being REPEAT [n] (1≤n≤nc-1), causes the current variable number to be reset to a lower number, earlier in the variable sequence. It repositions DAPHNE to ask for data entry from a previous variable number. A negative n causes a backward jump of -n variables. Its effect is that the user can change values from any variable number up to the current one. This "change" obviously is not accomplished within the REPEAT command, but is achieved during normal data entry, which will only be repeated for a part of the variables. REPEAT itself does not change anything, except for the current variable number, which is displayed immediately.
The same effect may be obtained when entering, for example, the direct command sequence A F n A.

REPEAT is only valid when running ASKDATA or REVISE.

R E S U M E   A T

RESUME is meant for explicitly leaving values of variables, still to come, unchanged. Its syntax, RESUME [n] (nc+1≤n≤nl+1), redefines the current variable number as a higher number in the variable sequence. It only causes a jump forwards, like REPEAT causes a jump backwards, without performing any further action than to force DAPHNE to resume normal data entry at a later point. If n=nl+1 (END, one more than the total number of variables), DAPHNE terminates the data entry procedure, because she has been positioned after the last variable. At this moment the end-of-case state procedure starts, which will be described later (see section 9). If n is negative it causes -n variables to be skipped, resuming data entry at the redefined current variable number nc-n.

The sense of the RESUME command is the following: once data entry for a case has been completed (whether the data have been written to the database or not (see section 10)) and data for the next case are requested, the values of the former case remain available in the central memory (and the eventual memory file). They generally will be overridden by the values (or commands) applicable for the next case, but they can be "copied" to the immediately succeeding case by leaving them unchanged during the data entry for that case. Thus, the RESUME command is intended for copying values (of certain variables) from one case to another and so on.

RESUME is only valid within ASKDATA and REVISE. When applying RESUME within REVISE it is possible to copy values between initially not directly succeeding cases by creating a new case sequence. This will be explained in detail later (see section 10).

O M I T

OMIT very simply provides for the deletion of a value (only one) by specifying a variable number below the current one to be deleted: OMIT [n] (1≤n≤nc-1).
If n is negative, the resulting variable number is n(abs) lower than the current one. The effect of the deletion is that all values of the subsequent variable numbers n+1 to nc-1 are moved one variable backwards, thus to the numbers n to nc-2. This is followed by a redefinition of the variable number, becoming the current one: Nc=nc-1, from which DAPHNE resumes normal data entry.

OMIT is only valid when running ASKDATA or REVISE.

I N S E R T

INSERT quite logically takes care of the opposite action: it inserts a value for any variable below the current one: INSERT [n [v]] (1≤n≤nc-1). A negative number has the same effect as with OMIT. Before assigning the specified value to the specified variable number, all values of the specified and subsequent variable numbers n to nc-1 are moved forwards to the numbers n+1 to nc. Finally the current variable number is redefined to nc+1, which, however, never can exceed the total number of variables plus one: nl+1, the end-of-case state (see section 9). If the end-of-case state already was reached before inserting, the value of the very last variable is lost.

INSERT is only valid within ASKDATA and REVISE.

M I S S I N G   ( V A L U E )

MISSING is used to display or (re)define the current missing value. The missing value is not treated as missing by DAPHNE, but may be used as such with any statistical program. To DAPHNE it is merely a user defined constant value, which may be entered during data entry as @ and which is used with the ASSIGN command.

MISSING is only valid when executing ASKDATA or REVISE.

There are two different forms of syntax, depending on whether one wants merely to display or to redefine the missing value:

   MISSING    without any addition, causes the current missing value to be displayed.
   MISSING v  including a specific value, should be entered for defining and displaying a new value as missing.
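The value shifts performed by OMIT and INSERT, as described above, can be illustrated with a minimal Python sketch. This is not DAPHNE's FORTRAN code; the function names resolve, omit and insert and the use of a plain list as the case buffer are assumptions made for illustration only. Variable numbers are 1-based as in the text, nc is the current variable number and the list length plays the role of nl.

```python
def resolve(n, nc):
    """Turn a possibly negative variable number into an absolute one.

    A negative n addresses the variable |n| places below the current one,
    i.e. nc + n. The result must satisfy 1 <= n <= nc-1.
    """
    target = nc + n if n < 0 else n
    if not 1 <= target <= nc - 1:
        raise ValueError("variable number must satisfy 1 <= n <= nc-1")
    return target


def omit(values, nc, n):
    """OMIT n: drop the value of variable n and return the new current number.

    values[i] holds variable i+1. Variables n+1..nc-1 move to n..nc-2.
    The slot of the former variable nc-1 keeps its old content here
    (an assumption); it is simply asked for again during data entry.
    """
    n = resolve(n, nc)
    values[n - 1:nc - 2] = values[n:nc - 1]
    return nc - 1


def insert(values, nc, n, v):
    """INSERT n v: make room at variable n, store v, return the new current number.

    Variables n..nc-1 move to n+1..nc. In the end-of-case state (nc = nl+1)
    the value of the very last variable is lost, as the text states.
    """
    nl = len(values)
    n = resolve(n, nc)
    last = min(nc, nl)                      # highest variable that can receive a value
    values[n:last] = values[n - 1:last - 1]
    values[n - 1] = v
    return min(nc + 1, nl + 1)              # never beyond the end-of-case state
```

With six variables of which four have been entered (nc=5), omit(values, 5, 2) closes the gap at variable 2 and returns 4 as the new current variable number; a subsequent insert(values, 4, 2, 99) opens the gap again, stores 99 there and returns 5.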
A S S I G N   ( M I S S I N G   V A L U E   P A S T )

ASSIGN (abbreviation AM, not A) may be applied when one wants to jump forwards over a series of variables quickly (similarly as with RESUME), but instead of leaving their values unchanged, having them replaced by a (missing) value. Though the syntax may not look quite logical, it is very convenient to use: ASSIGN [n [v]] (nc≤n≤nl) and resembles the syntax types of some other commands (CORRECT and INSERT). If n is negative it indicates the number of variables from the current one to be assigned the (missing) value and DAPHNE resumes asking for data again at the newly defined current variable number nc-n. If no value is specified, the current missing value will be assigned to all the skipped variables. If any value, whether or not different from the current missing value, is specified, that value is used instead, without changing the current missing value. If neither a variable number nor a value has been specified, DAPHNE prompts for both of them.

Apart from the main difference between RESUME and ASSIGN (the assignment of values) there is a small difference in the variable number, which must be specified: RESUME needs a variable number to resume at, which is one higher than the variable number past which to ASSIGN values, though normal data entry resumes from the same variable number.

ASSIGN is only valid if executing ASKDATA or REVISE.

The commands ALTER and ASSIGN, both generally modifying more than one value, do not change automatically all values immediately. ASSIGN provides the user with the opportunity to choose between changing them all at once (YES), cancelling the ASSIGN command (NO, no changes made at all) or applying a VETO option. ALTER directly, when starting to change values, supplies the VETO option. Its effect is that changes are proposed one by one (one value at a time).
At each displayed proposal the user may respond with YES, NO, AUTO or BREAK:

   YES    means to perform the modification actually for this variable and to continue with the next value proposed.
   NO     means not to accept the proposal, to leave the original value for this variable unchanged and to continue with the next value proposed.
   AUTO   indicates to start at once with modifying this and all succeeding (appropriate) values automatically. It in fact switches off the VETO response.
   BREAK  causes the data modification procedure to terminate at this point and to proceed with normal data entry or revision.

S T O P

STOP is to be applied if the user wants to suspend DAPHNE during data entry or during the end-of-case state, that is before writing the (not) completed case to the database (see section 9). It is only valid within ASKDATA or REVISE if a memory file is applied. Otherwise the command STOP is not accepted, because the values, which only exist in the central memory, are lost if they are not written to the database. DAPHNE prompts to STOP with: "ARE YOU SURE?", offering the user the opportunity to reconsider his decision. In order to suppress DAPHNE's friendly care one may enter directly the command sequence STOP SURE (or S S).

9. END

The end-of-case state may be reached by:

a. completing normal data entry for a case, having entered its final (last) value.
b. entering the commands RESUME or ASSIGN during data entry and specifying all remaining variables to be skipped.
c. INSERTing a value if there was still only one value left to be entered during normal data entry.
d. entering the command END within ASKDATA after entering all values for the first case, when the number of variables initially was declared unknown (NOLABELS) (see section 6).
e. having entered 5000 values for the first case within ASKDATA, while until then no number of variables was defined.
The end-of-case state is indicated by DAPHNE by asking whether the user still wants to LIST OR CHANGE ANY VALUE (LIST only, if running EXAMINE). This is the most user friendly way of checking the values before definitively writing the case to the database. In contrast to expecting values during normal data entry, DAPHNE now expects a variable number given by the user, or for example the response YES, which causes DAPHNE to explicitly request the variable number. After responding appropriately the specified variable number (, its label) and its value are displayed. Within ASKDATA and REVISE only, DAPHNE next asks whether to change the value or not. Thereupon the user may respond with YES or NO or the desired value immediately, which is redisplayed too. Then DAPHNE repeats the LIST OR CHANGE ANY VALUE question.

In this stage it is not permitted to use the special terms indicating specific variable numbers (FIRST, LAST, etc.). That is because, except for YES or a number, one may also specify a command as described in the previous section, and the abbreviations of the commands would be confused with those of the special terms. If applying the commands, however, the special terms for the numbers and their abbreviations are allowed again. The user should consider that the current variable number in this stage is the total number of variables plus one. There are no more variables to follow and therefore some of the commands are meaningless and illegal, such as RESUME and ASSIGN.

Summarizing, the responses given to DAPHNE's question may be YES, a variable number, a valid command and the answer NO. NO closes the case, making further checks impossible, but the case still has not yet been written to the database. Now, within ASKDATA and REVISE only, the user may choose between writing the case to the data file or not.
If one, on second thoughts, still wants to amend the values of the case being closed, one should choose not to write the case, after which all values in the central memory (and the eventual memory file) are available again during normal data entry or the end-of-case state for the "next" case, in fact being the same case.

10. RESTART

Anyway, whether having chosen to write the case to the database or not, or after just having one of these programs (re)started, DAPHNE asks whether to ENTER (ASKDATA), REVISE (REVISE) or EXAMINE (EXAMINE) values of the old next case. If answering YES the data entry procedure is restarted, within ASKDATA at the beginning-of-case (variable number 1) and within REVISE and EXAMINE at the end-of-case. Within EXAMINE the user always is in the end-of-case state; within REVISE one generally starts in this state, but one can force normal data entry by the commands REPEAT and OMIT, modifying the current variable number. The user should also remember that in the beginning-of-case state (variable number 1) certain commands are illegal, such as CORRECT, ALTER, REPEAT and OMIT.

If one, however, answers NO to DAPHNE's last question, what happens depends on the program being executed: within ASKDATA DAPHNE terminates the run (the program stops), but within REVISE and EXAMINE DAPHNE presents the user a menu with case handling commands, which are:

   for REVISE :  ENTER  REWIND(READ)  COPY  JUMP  INSERT  STOP
   for EXAMINE:  ENTER  REWIND(READ)  JUMP  GET  STOP

Instead of answering NO to DAPHNE's last question, it is also allowed to enter one of these commands directly. They may also be abbreviated with their first letter. Not all commands are applicable and presented at the same time; that depends on the positioning of DAPHNE in the old database. Some commands require additional case numbers, which are requested with the stepwise method. Instead, one may also apply the direct method and specify the case numbers in the same command sequence.
In some instances the stepwise and the direct method yield different results: in those cases the additional case numbers are not obligatory and are not requested by DAPHNE. Adding them, however, to the direct command forces slightly different actions. Like the variable numbers in section 8, case numbers may be specified using negative numbers too. They indicate (in absolute size) the number of cases to be handled or a case number relative to the current one; whether lower or higher depends on the preceding command.

Next these commands are discussed in more detail. Optional case numbers are represented by i; the current old case number, after which DAPHNE is positioned, is denoted by ic (initially 0); and the total number of cases in the old database is indicated by it. The current new case number is the number of cases already copied or written to the new database, which is always positioned at end-of-file.

E N T E R

ENTER causes DAPHNE to read the next case from the (old) database into the central memory (and the memory file) in order to manipulate its values, with the same result as when responding YES to the former question. If entering: ENTER i (ic+1≤i≤it) the user forces DAPHNE to copy cases from the old database to the new one (REVISE), or to skip in the old database (EXAMINE), past case i-1 and to read old case i into the memory. With a negative i exactly -i-1 cases are copied (skipped), causing the succeeding old case to be read. The current old case number becomes i (if positive) or ic-i (if negative). Data revision or examination starts with the end-of-case state.

R E W I N D   ( R E A D )

REWIND causes the old database to be rewound, positioned before old case 1, and repeats the case selection procedure described in this section from the restart.

READ i (1≤i≤it), contrary to ENTER, forces reading old case i into memory without copying any cases to the new database (REVISE).
Whether the old database needs to be rewound before read depends on the position of i in regard to ic.

C O P Y   ( P A S T )

COPY [i] (ic+1≤i≤it) activates DAPHNE to copy old cases to the new database past old case i. If i is negative it indicates in the absolute sense the number of cases to be copied.

J U M P   ( P A S T )

JUMP [i] (ic+1≤i≤it) is very similar to COPY, though does not copy cases, but only skips them (see further COPY).

I N S E R T   ( P A S T )

INSERT allows the user to insert a case at the current point (for adding to the new database). DAPHNE returns to the user in the normal data entry procedure at the beginning-of-case (the only time this occurs within REVISE). INSERT i (ic≤i≤it) firstly starts to copy cases from the old to the new database past old case i and then allows for the insertion of a case as before. A negative i causes -i cases to be copied prior to the insertion. (REVISE only)

G E T

GET transfers the data from the memory file (if present) to the central memory, because when using EXAMINE, the values contained within the memory file may differ from those in the data file (e.g. a still incomplete new case). (EXAMINE only)

S T O P

STOP logically commands DAPHNE to terminate the program immediately and so she does, with EXAMINE as well as with REVISE. However, if during REVISE not yet all old cases are skipped or copied (modified or not), that is, the old database is not yet positioned at end-of-file, DAPHNE prompts for copying the remaining cases or not before final termination. This prompt may be avoided, specifying: STOP COPY in order to tell DAPHNE to carry out the rest copy, or: STOP SURE for leaving the rest as it is and to terminate at once.

Combinations and repetitive use of the just described commands, jumping and reading criss-cross through the old database, make it possible to read the old database in a specified sequence, thus changing the case sequence in the new database.
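The way positive and negative case numbers are interpreted by ENTER, COPY, JUMP and INSERT can be summarized in a small Python sketch. It is an illustration of the rules stated above, not DAPHNE's own code; the function name resolve_case and its return convention are hypothetical, and it is assumed that after COPY, JUMP or INSERT i the old database is positioned just past old case i.

```python
def resolve_case(command, i, ic, it):
    """Return (cases_passed, new_ic) for a case handling command.

    command      -- "ENTER", "COPY", "JUMP" or "INSERT"
    i            -- case number as typed; negative means relative to ic
    ic           -- current old case number (0 before the first case)
    it           -- total number of cases in the old database
    cases_passed -- number of old cases copied (REVISE) or skipped (EXAMINE)
    new_ic       -- current old case number afterwards
    """
    # A negative i counts forwards from the current old case.
    target = ic - i if i < 0 else i

    # INSERT may name the current case itself (ic <= i <= it);
    # the other commands must move forwards (ic+1 <= i <= it).
    lowest = ic if command == "INSERT" else ic + 1
    if not lowest <= target <= it:
        raise ValueError("case number out of range for " + command)

    if command == "ENTER":
        # Pass cases up to target-1, then read case target into memory.
        return target - 1 - ic, target
    # COPY, JUMP and INSERT pass all cases up to and including target.
    return target - ic, target
```

For example, with ic=2 both ENTER 5 and ENTER -3 pass two cases (3 and 4) and read old case 5, whereas COPY -3 passes three cases (3, 4 and 5).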
It is not possible to manipulate the new database: it is positioned at end-of-file, and once written it remains written and can only be manipulated by specifying it as the old database in a next run. The old and the new data files can only be deleted (destroyed, purged) outside DAPHNE within the operating system. This is intentional, so that nothing valuable is destroyed accidentally too easily.

11. PARAMETERS

In sections 5 and 6 it was already mentioned that the run commands for DAPHNE's programs may be extended with parameters on the run command line. This feature offers the user the possibility to initialize all interactively requested parameters, discussed in section 5, at execution time and to specify different parameters when using an already existing memory file with its fixed parameters.

Firstly a global review of all possible parameters, to be applied to the programs ASKDATA, REVISE, EXAMINE, REFORM and REVIEW is given, followed by a more extensive description of the effects of some parameters. For the meaning of the terms "lfn" and "id" and a description of the contents of the parameters, see sections 5 and 6. Any missing value which includes a decimal point (and decimals) or a minus sign should be specified within dollar signs (e.g. V=$-1$).

   Keywords and parameters         DEFAULTS     1 2 3 4 5 | programs
   N=lfn of (new) database         NEWDATA      x x . x . | 1=ASKDATA
   W=write format specification    FREEFIELD    x x . x . | 2=REVISE
   L=label specification           NOLABELS*)   x x x x x | 3=EXAMINE
   M=memory file specification     NOMEMORY     x x x x x | 4=REFORM
   O=lfn of (old) database         OLDDATA      . x x x x | 5=REVIEW
   R=read format specification     FREEFIELD    . x x x x |
   P=lfn of review to print        OUTPUT       . . . . x |
   ID=id of permanent files        no id        x x x x x |
   V=missing value definition      0            x x . . . |
   DEFAULTS (see text below)       -            x x x x x |

   *) applies to ASKDATA only (see section 6 and below)

The user should keep in mind that all file names, specified within the memory file or as additional parameters with the run command, must be different. There is only one exception: all format files may be the same one, if one wishes. Violating this condition while specifying the same file names as parameters with the run command causes DAPHNE to abort. Violating it during interactively initializing DAPHNE, whether or not creating a new memory file, causes DAPHNE to respond appropriately without aborting.

Any valid combination of parameters may be specified when starting one of the programs. The parameters should be separated from each other and from the program statement by commas. They may be entered using both the positional mode (specifying only the parameters themselves) and the equivalence mode (specifying the sequence "keyword=parameter") as well as a combination of these modes as long as the positional mode precedes the equivalence mode. In the positional mode the parameters should be specified in the sequence as listed above, including extra commas for not specified preceding parameters. In the equivalence mode the sequence of the parameters is not important. The parameter DEFAULTS may be specified anywhere, regardless of which type of mode is applied.

During primary initialization of parameters describing the construction of a specific database all necessary parameters for a program, as far as not specified at execution time, are requested by DAPHNE interactively, as already explained in section 6. There are two exceptions to this rule:

a. specifying the parameter DEFAULTS (as one of the parameters) on the run command line causes the interactive questions for other parameters to be suppressed. Parameters not being specified will take default specifications as indicated in the review above.
The L(abel) parameter, however, has no default specification if running REVISE, EXAMINE, REFORM or REVIEW and therefore is always obligatory. If it is not included in the run command DAPHNE will always ask for it, regardless of the DEFAULTS specification.

b. running REFORM or REVIEW in a batch job (not connected to the terminal) requires all parameters to be given at execution time, because there will be no interaction between DAPHNE and a user at all. Parameters not being specified, even if DEFAULTS is not specified, take default specifications. The only obligatory parameter is the L parameter: without it DAPHNE aborts.

When calling one of DAPHNE's programs subsequently and a memory file already has been created formerly, specifying parameters will generally temporarily override the corresponding parameters already set within the memory file. Parameters not specified at execution time are already known within the memory file. This prevents the interactive prompt for those parameters, just as the parameter DEFAULTS would suppress it. Hence, indicating DEFAULTS when having a memory file available is rather redundant and may as well be omitted.

However, within REFORM DEFAULTS causes another specific action, which can be explained as follows: generally within all programs the new data file (which is to be written or extended) initially is read and positioned at the end-of-file. Similarly the old data file (which is to be read only) initially is positioned at the begin-of-file. Within REFORM these initializations are also performed unless DEFAULTS has been declared in the run command. Hence, in this special case DAPHNE copies (reads and writes) the old database to the new one from the points where both files were positioned prior to running REFORM.

Additional temporary parameter specifications may differ with regard to their effects depending on other specifications.
With ASKDATA and REVISE, specifying the W(rite format) parameter without the N(ewdata) parameter causes the specified format to be used only during initial reading of the (new) database from the begin-of-file to the end-of-file. Extending this database is accomplished using the write format specified during initialization of the memory file. This feature was added to DAPHNE because FORTRAN read and write fixed formats for the same database are not necessarily equal (e.g. a write format which also includes a character constant). In order to avoid read errors with a specific write format, a temporary read format may thus be specified, enabling DAPHNE to properly read and count cases from an already existing database.

If this feature is not used and reading the database with the write format causes read errors, DAPHNE automatically recovers from them by rewinding the database and rereading it, using a standard format suitable for reading any text. In this way the end-of-file is reached without reading errors. However, the data then are not read as data, and if the user does not apply a memory file it is impossible to enter the values of the last case of the database into the memory. This method only enables DAPHNE to count the number of cases, which she can do because all cases are separated from each other by end-of-records, automatically added to the cases when they are written by DAPHNE. After completing a database the user may remove the end-of-records, if he desires (outside DAPHNE, using the system COMBINE command).

If one wants to override the write format of the memory file during the whole run with another format, besides the W parameter the N parameter should be specified too, even if it is the same one as known within the memory file. This method causes the externally specified write format to remain effective during both reading from and writing to the (new) database.
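The positional and equivalence modes for run command parameters, described earlier in this section, can be illustrated with the following Python sketch. It is only a model of the stated rules, not DAPHNE's own parser: the function name parse_run_parameters, the file names in the example and the decision that DEFAULTS occupies no positional slot are assumptions. The positional sequence is taken from the parameter review in this section.

```python
# Positional sequence as listed in the parameter review.
POSITIONAL_ORDER = ["N", "W", "L", "M", "O", "R", "P", "ID", "V"]


def parse_run_parameters(text):
    """Split the parameter part of a run command into (params, defaults).

    text     -- everything after the program name, e.g. "MYDATA,,LABFILE,M=MEMFILE"
    params   -- dict mapping keyword to the specified parameter
    defaults -- True if the DEFAULTS parameter was given
    """
    params = {}
    defaults = False
    position = 0
    seen_keyword = False
    for field in text.split(","):
        field = field.strip()
        if field.upper() == "DEFAULTS":
            defaults = True                  # allowed anywhere (assumed: no slot used)
            continue
        if "=" in field:                     # equivalence mode: keyword=parameter
            key, value = field.split("=", 1)
            key = key.strip().upper()
            if key not in POSITIONAL_ORDER:
                raise ValueError("unknown keyword: " + key)
            seen_keyword = True
        else:                                # positional mode
            if seen_keyword and field:
                raise ValueError("positional parameter after keyword=parameter")
            if position >= len(POSITIONAL_ORDER):
                raise ValueError("too many positional parameters")
            key, value = POSITIONAL_ORDER[position], field
            position += 1
            if not field:                    # extra comma: parameter not specified
                continue
        # Missing values with a sign or decimals are written within dollar signs.
        params[key] = value.strip().strip("$")
    return params, defaults
```

With the hypothetical file names used here, parse_run_parameters("MYDATA,,LABFILE,M=MEMFILE,DEFAULTS,V=$-1$") assigns MYDATA to N and LABFILE to L positionally (the extra comma leaves W unspecified), takes M and V from the equivalence mode, and reports that DEFAULTS was given.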
A similar method does not apply with the old database and its corresponding read format, because there is no need for it. Specifying one or both of them at execution time makes them effective during the whole run.

12. GENESIS

Some years ago I very much needed an interactive data entry program, because I had many data (on paper) concerning hundreds of cases and more than 3000 variables, which were to be analysed with SPSS. As I only had access to one mainframe computer at the time (via a remote terminal station), I checked whether there was any data entry program available meeting my needs. It appeared that there was no such program known. I was recommended to enter my data from the terminal in the form of punch documents using a text editor. That of course was not my intention. So I decided to write my own data entry program.

A real FORTRAN programmer like me should shrink from nothing and I began cheerfully with the job. It soon appeared that it would take rather much time and effort, more than I initially had expected. But there was no way back anymore, the beginning had been made. And so it grew larger and larger and I had to take care not to make it too cumbersome and to keep it well structured.

Anyway, I am satisfied with the result. My own demands are met (though the demands and the actual possibilities tend to cover each other in the course of time). DAPHNE is taking on a more and more definitive shape; her childhood diseases have been overcome. Yet she will always remain subject to modification and improvement.

Though DAPHNE's possibilities may sometimes look rather complicated, in practice her use will be quite easy and simple. She guides the ignorant user as much as possible and suits the more experienced user by allowing him to apply quick and condensed command forms. Communicating with her is quite logical, user friendly, and resembles plain English rather well.

13. DEVELOPMENTS

Some future features briefly are:

- handling alphanumerical values as well
- displaying (user defined) value labels as well
- additional editing commands
- automatic repetition of last command sequence
- application of macro command structures
- changing initialized parameters within the memory file
- changing file names and other parameters during run
- DAPHNE/PC version, storing double precision values