STATIST Manual

Contents

1  Introduction
2  Installation
    2.1  From source
    2.2  Using binaries
3  Invocation
4  Menu
5  Statist and Gnuplot
    5.1  Saving graphics
    5.2  Box-plot
6  Data
7  Batch/Script
8  Useful Tips

1  Introduction

Statist is an easy-to-use, lightweight statistics program. It is the ideal tool for people who don't want to memorize commands. Everything is in an interactive menu: you just have to choose what you need.
Statist is open source software, and, like all free programs, it comes with absolutely no guarantee.

2  Installation

2.1  From source

  1. Open a terminal.
  2. Unpack the source code, compile the program, and become root to install it. That is, type:
    tar -xvzf statist-1.0.2.tar.gz
    cd statist-1.0.2
    make
    su
    (enter the root password at the prompt)
    make install
    exit

2.2  Using binaries

3  Invocation

You can simply type:
    statist data_file

However, there are also some options that you might find useful:
    statist [-help -silent -log -nobell -nofile -noplot 
             -thist --bernhard] data_file

The only one that you have to memorize is -h, which gives you the following help text:
Options:
-h, -help, -? : print this help message and exit
-silent       : don't print menu etc. (for batch/script usage)
-log          : write results to log file `statist.log'
-nofile       : don't read a data file when starting the program
-nobell       : no beep at errors and warnings
-thist        : histogram as text graphic instead of gnuplot-graphic
-noplot       : no gnuplot-graphic
--bernhard    : special output changes from Bernhard, i.e.:
                - table output at Miscellaneous/Standard deviation
                - if -noplot defined no text histogram at
                      Miscellaneous/Standard deviation
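If you start Statist from a script, the options above can be assembled programmatically. The following is only a sketch, assuming Python 3.8 or later; the function name statist_command and its keyword names are hypothetical, and only the option strings themselves come from the help text.

```python
import shlex

# Option strings copied from the help text; the dictionary keys are hypothetical.
FLAGS = {
    "silent": "-silent",
    "log": "-log",
    "nofile": "-nofile",
    "nobell": "-nobell",
    "thist": "-thist",
    "noplot": "-noplot",
    "bernhard": "--bernhard",
}

def statist_command(data_file=None, **enabled):
    """Return the argument list for one statist call."""
    unknown = set(enabled) - set(FLAGS)
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(sorted(unknown)))
    args = ["statist"]
    for name, flag in FLAGS.items():
        if enabled.get(name):
            args.append(flag)
    if data_file is not None:
        args.append(data_file)
    return args

# Show the command line as it would be typed in a terminal.
print(shlex.join(statist_command("data_file", log=True, noplot=True)))
# prints: statist -log -noplot data_file
```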

4  Menu

The program has a simple menu that makes it very easy to use. There is no need to remember commands. Each menu entry starts with a digit. `0' always leads to the next higher menu level and, consistently, exits the program if you are already in the Main menu. One tip is important: if you choose a menu entry by mistake, you can always cancel the process by pressing the <Return> key before entering any value or answering any question. When you do that, the last menu is printed again.
If you choose a statistical procedure from the menu, you will be asked to choose the variables. Often, it is not necessary to type the entire name of a column when entering variable names for analyses. For example, if you have a column labeled
this_really_is_a_big_name
and there is no other column starting with the letter `t', you can simply type `t'.
Actually, the whole process is self-explanatory, and you would be able to use the program even without reading this short explanation.
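The abbreviation of column names described above can be pictured as an unambiguous prefix lookup. This is a sketch of the idea only, not Statist's actual code: the function resolve_column is hypothetical, and the rule that an exact name wins over a longer name with the same beginning is an assumption.

```python
def resolve_column(typed, labels):
    """Return the label that `typed` refers to, or raise if it is unclear."""
    if typed in labels:
        return typed  # assumption: an exact name always wins
    matches = [label for label in labels if label.startswith(typed)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("no column starts with %r" % typed)
    raise ValueError("%r is ambiguous: %s" % (typed, ", ".join(matches)))

labels = ["this_really_is_a_big_name", "kow", "kaw"]
print(resolve_column("t", labels))   # prints: this_really_is_a_big_name
```

With these labels, typing `k' alone would be rejected as ambiguous, because both kow and kaw start with it.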

5  Statist and Gnuplot

Gnuplot is an interactive program that makes graphical presentations from data and functions. If Statist is running under Unix, then certain functions of Statist create gnuplot graphics. The only prerequisite is that gnuplot is installed and is in the PATH.
Furthermore, if you know gnuplot syntax, you can refine or personalize your graphics by entering gnuplot commands. To do that, choose the menu option Miscellaneous - Enter gnuplot commands (only under Unix). The gnuplot graphics can be disabled by invoking the program with the option -noplot. This can be useful if, for example, you work with batch processing or if your database is so big that gnuplot graphics are generated too slowly.

5.1  Saving graphics

To save a graphic created by gnuplot, you can use a program like Gimp or KSnapShot, but there are also other ways of saving graphics. For example, open a terminal, and be sure that the terminal window and the gnuplot window don't overlap. Then, type:
   import name.png

The cursor will become a cross. Then, click on the graph (not on the gnuplot window title bar), and the graphic will be saved as name.png. To create a graphic in a different format, use a different extension instead of .png (.gif, .bmp, etc.). The import program is part of the ImageMagick package.

5.2  Box-plot

You probably will have no problem interpreting Statist graphics. The only one that might need some explanation is the Box-and-Whisker Plot. The picture below shows the meaning of each piece of this graphic:

6  Data

Statist reads data from simple ASCII files (text files). Either you invoke the program with an ASCII file, or the program immediately asks for the name of a data-file. Without a data-file, there is nothing to do, unless you give the option -nofile when invoking the program in order to enter the data directly from the keyboard (choose from the menu: Data management - Read column from terminal). However, it is only rarely reasonable to do this.
A data-file consists of one or several columns of data. Currently the database can have at most 60 columns, but we plan to eliminate any limit on the number of columns in future releases of Statist. The columns of numbers must be separated from each other by either tab characters or spaces. Missing values must be indicated by the capital letter `M'. Below is an example of a data-file:
#Example data-file for statist  
  1  3  5  6  
  7  8  9 10  
 11 12 13 14  
 15  M 16  M  

As you can infer from the above example, comments begin with the symbol `#' and are ignored. Empty lines are also ignored. When Statist reads the data file, each column is assigned to one variable. The first column will be column `a', the second will be `b', etc. However, to keep data files with many variables understandable, it is also possible to give more meaningful names to columns. That has the advantage that you no longer have to remember which column a certain variable corresponds to. To do that, begin a line with "#%". Like comment lines, the line must begin with one `#', but this symbol must be followed by one `%'. The names are then assigned to the columns as follows:
#% kow kaw ec50  
0.34 4.56 0.23
1.23 5.45 6.76
6.78 1.34 9.60
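The file format described in this section is simple enough to check with a few lines of code before handing a file to Statist. The sketch below is not Statist's own parser; the function parse_data_file is hypothetical, it represents missing values as None, and it refuses to invent default names past `z' because this manual does not say how such columns are named.

```python
import string

def parse_data_file(text):
    """Parse data-file text into (labels, rows); `M' becomes None."""
    labels = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                        # empty lines are ignored
        if line.startswith("#%"):
            labels = line[2:].split()       # line with column labels
            continue
        if line.startswith("#"):
            continue                        # ordinary comment
        rows.append([None if tok == "M" else float(tok)
                     for tok in line.split()])   # tabs or spaces
    ncols = len(rows[0]) if rows else 0
    if any(len(row) != ncols for row in rows):
        raise ValueError("rows have different numbers of columns")
    if labels is None:
        if ncols > 26:
            raise ValueError("default names beyond 'z' are not covered here")
        labels = list(string.ascii_lowercase[:ncols])   # a, b, c, ...
    elif len(labels) != ncols:
        raise ValueError("number of labels differs from number of columns")
    return labels, rows

example = """#Example data-file for statist
  1  3  5  6
  7  8  9 10
 11 12 13 14
 15  M 16  M
"""
labels, rows = parse_data_file(example)
print(labels)   # prints: ['a', 'b', 'c', 'd']
```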

The number of variable labels declared must be exactly the same as the number of columns. Only letters, digits, and `_' are allowed in labels. Each column of the database is saved in a temporary binary file in /tmp, where all values are stored as double precision floating point numbers (real numbers). These files are erased when you quit Statist. The missing values are stored as the smallest possible number, that is: 1.7977 * 10^308. You have to be sure that this number isn't in your data file as a valid number.
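The number quoted above has the magnitude of the largest finite double precision value. The following sketch shows one way to check that a list of values does not collide with it. The function collides_with_sentinel is hypothetical, and comparing by magnitude (so that both signs are caught) is an assumption made here, not a description of how Statist stores the value.

```python
import sys

# Largest finite double; its magnitude matches the number quoted above.
DOUBLE_MAX = sys.float_info.max
print(DOUBLE_MAX)   # prints: 1.7976931348623157e+308

def collides_with_sentinel(values):
    """Return the values whose magnitude equals the largest finite double."""
    return [v for v in values if abs(v) == DOUBLE_MAX]

print(collides_with_sentinel([0.34, 4.56, 0.23]))   # prints: []
```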
Before each analysis, Statist reads the selected columns from the temporary files into RAM and, if necessary, either deletes the rows that have at least one missing value or simply deletes the missing values. However, the deletions occur only in a copy of the temporary files that is created in the computer memory. The temporary files remain intact until you quit the program. For example, the menu option Regressions and correlations - Multiple linear correlation will delete all rows that have missing values in any one of the chosen columns. Do this analysis if each row in your database represents a single case, which is very common in social sciences. The menu option Tests - t-test for comparison of two means of two sets will delete every missing value, but a missing value in a column will not cause the entire row to be deleted. Use this option if, for example, you have two series of experiments and want to compare the two sets of results.
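The difference between the two ways of handling missing values can be illustrated on two short columns. This is a sketch of the idea, not Statist's code; the function names are hypothetical and missing values are represented as None.

```python
def drop_incomplete_rows(columns):
    """Row-wise deletion: keep only rows where every column has a value."""
    kept = [row for row in zip(*columns) if None not in row]
    if not kept:
        return [[] for _ in columns]
    return [list(col) for col in zip(*kept)]

def drop_missing(column):
    """Column-wise deletion: remove the missing values of one column only."""
    return [v for v in column if v is not None]

a = [1.0, None, 3.0, 4.0]
b = [5.0, 6.0, None, 8.0]

# As for a correlation: rows 2 and 3 are dropped from both columns.
print(drop_incomplete_rows([a, b]))        # prints: [[1.0, 4.0], [5.0, 8.0]]

# As for a comparison of two independent sets: each column keeps its own values.
print(drop_missing(a), drop_missing(b))    # prints: [1.0, 3.0, 4.0] [5.0, 6.0, 8.0]
```

In both cases the original lists a and b are left untouched, in the same way that the temporary files remain intact.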
If you want to work only with subsets of your database, you can write columns into a text file (ASCII file) by choosing the menu option Data Management - Export columns as ASCII-data. You can also read data from several files simultaneously (Data Management - Read another file). This option is useful if you use data from different sources. When you Read another file, new columns are added to the database, and if a column label in the new file is already in use in the current database, the symbol "_" will be appended to it.
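The renaming rule for labels that are already in use can be sketched as follows. The function merge_labels is hypothetical, and appending "_" repeatedly until the label is unique is an assumption; the text above only says that the symbol is appended.

```python
def merge_labels(current, incoming):
    """Return the labels of the database after reading another file."""
    merged = list(current)
    for label in incoming:
        while label in merged:    # assumption: repeat until the label is unique
            label += "_"
        merged.append(label)
    return merged

print(merge_labels(["kow", "kaw"], ["kow", "ec50"]))
# prints: ['kow', 'kaw', 'kow_', 'ec50']
```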
Another possibility is to join columns (Data manipulation - Join columns). In this case, the selected columns will be concatenated into a single, longer column.

7  Batch/Script

If you have to repeat the same analysis many times, you would become bored of starting Statist and choosing the same options from the menu again and again. If this is your case, you can use the batch mode. You have to invoke Statist with the option -silent, and give it a file containing what you would have to type if Statist were running in the normal mode. For example, if you want to run a correlation between variables "a" and "b" in a data file called "day365.dat", you could create a file named, for example, "autopilot" with the following content:
2
1
a
b
0
0

The next step would be to invoke statist with the following command:
   statist -silent -noplot day365.dat < autopilot

The result will be printed on the screen. However, if you prefer the results saved in a file called, say, report365, type:
   statist -silent -noplot day365.dat < autopilot > report365
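If the same batch run has to be repeated for many data files, the two steps above (writing the answers file and calling statist with redirected input and output) can be wrapped in a small script. This is only a sketch: the function names write_answers and run_batch are hypothetical, and run_batch assumes that statist is installed and in the PATH.

```python
import subprocess
from pathlib import Path

def write_answers(path, answers):
    """Write one menu answer per line, as they would be typed interactively."""
    Path(path).write_text("".join(answer + "\n" for answer in answers))

def run_batch(data_file, answers_file, report_file):
    """Equivalent of: statist -silent -noplot DATA < ANSWERS > REPORT"""
    with open(answers_file) as stdin, open(report_file, "w") as stdout:
        return subprocess.run(
            ["statist", "-silent", "-noplot", data_file],
            stdin=stdin, stdout=stdout, check=True,
        )

# Example use, with the answers from the autopilot file shown above:
#   write_answers("autopilot", ["2", "1", "a", "b", "0", "0"])
#   run_batch("day365.dat", "autopilot", "report365")
```

The call to run_batch is left as a comment because it only works on a system where statist and the data file are actually present.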

8  Useful Tips

http://www.usf.uos.de/~breiter/tools/statist/index.en.html



File translated from TEX by TTH, version 3.67.
On 15 Mar 2005, 10:34.