STATIST Manual
Contents
1 Introduction
2 Installation
2.1 From source
2.2 Using binaries
3 Invocation
4 Menu
5 Statist and Gnuplot
5.1 Saving graphics
5.2 Box-plot
6 Data
7 Batch/Script
8 Useful Tips
1 Introduction
Statist is an easy-to-use, lightweight statistics program. It
is the ideal tool for people who don't want to memorize commands.
Everything is in an interactive menu: you just have to choose what you
need.
Statist is open source software, and, like all free programs, it
comes with absolutely no guarantee.
2 Installation
2.1 From source
- Open a terminal.
- Unpack the source code, compile the program, and become root to
install it. That is, type the following commands (su will prompt for
the root password):
tar -xvzf statist-1.0.2.tar.gz
cd statist-1.0.2
make
su
make install
exit
2.2 Using binaries
3 Invocation
You can simply type:
statist data_file
However, there are also some options that you might find useful:
statist [-help -silent -log -nobell -nofile -noplot
-thist --bernhard] data_file
The only one that you have to memorize is -h, which gives you the
following help text:
Options:
-h, -help, -? : print this help message and exit
-silent : don't print menu etc. (for batch/script usage)
-log : write results to log file `statist.log'
-nofile : don't read a data file when starting the program
-nobell : no beep at errors and warnings
-thist : histogram as text graphic instead of gnuplot-graphic
-noplot : no gnuplot-graphic
--bernhard : special output changes from Bernhard, i.e.:
- table output at Miscellaneous/Standard deviation
- if -noplot defined no text histogram at
Miscellaneous/Standard deviation
4 Menu
The program has a simple menu that makes it very easy to use. There
is no need to remember commands. Each menu entry starts with a
digit. `0' always leads to the next higher menu level and,
consistently, exits the program if you are already in the Main
menu. One tip is important: if you choose a menu entry by mistake, you
can always cancel the process by pressing the <Return> key before
entering any value or answering any question. The last menu will then
be printed again.
If you choose a statistical procedure from the menu, you will be
asked to choose the variables. Often, it is not necessary to type the
entire name of a column when entering variable names for an analysis.
For example, if you have a column labeled
this_really_is_a_big_name
and there is no other column starting with the letter `t', you can simply
type `t'.
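The abbreviation rule can be pictured as a unique-prefix lookup. The
following shell sketch is only an illustration and is not taken from
the Statist source: the function name match_column is hypothetical, and
how Statist itself treats ambiguous abbreviations is not described in
this manual.

```shell
# Hypothetical helper (not part of Statist).
# match_column PREFIX LABEL... : print the label if exactly one label
# starts with PREFIX; print nothing and return 1 otherwise.
match_column() {
    prefix=$1
    shift
    found=""
    count=0
    for label in "$@"; do
        case $label in
            "$prefix"*)
                found=$label
                count=$((count + 1))
                ;;
        esac
    done
    if [ "$count" -eq 1 ]; then
        printf '%s\n' "$found"
    else
        return 1
    fi
}
```

With the labels this_really_is_a_big_name, kow and kaw, the
abbreviation `t' selects the first column, while `k' is ambiguous in
this sketch because two labels start with it.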
Actually, the whole process is self-explanatory, and you would be
able to use the program even without reading this short explanation.
5 Statist and Gnuplot
Gnuplot is an interactive program that makes graphical presentations
from data and functions. If Statist is running under Unix, certain
functions of Statist create gnuplot graphics. The only prerequisite
is that gnuplot is installed and is in the PATH.
Furthermore, if you know gnuplot syntax, you can refine or
personalize your graphics by entering gnuplot commands. To do that,
choose the menu option Miscellaneous - Enter gnuplot commands (only
under Unix). The gnuplot graphics can be disabled by invoking the program
with the option -noplot. This can be useful, for example, if you
work with batch processing or if your database is so big that gnuplot
graphics are generated too slowly.
5.1 Saving graphics
To save a graphic created by gnuplot, you can use a program like Gimp or
KSnapshot, but there are also other ways of saving graphics. For example,
open a terminal, and be sure that the terminal window and the gnuplot window
don't overlap. Then, type:
import name.png
The cursor will become a cross. Then, click on the graph (not on the
gnuplot window title bar), and the graphic will be saved as name.png.
To create a graphic in a different format, use a different extension
instead of .png (.gif, .bmp, etc.). The import program is part of the
ImageMagick package.
5.2 Box-plot
You will probably have no problem interpreting Statist graphics. The
only one that might need some explanation is the Box-and-Whisker
Plot. The picture below shows the meaning of each piece of this graphic:
6 Data
Statist reads data from simple ASCII files (text files). Either you
invoke the program with an ASCII file, or the program immediately asks
for the name of a data file. Without a data file, there is nothing to
do, unless you specify the option -nofile when invoking the program in
order to enter the data directly from the keyboard (choose from the
menu: Data management - Read column from terminal). However, this is
only rarely a reasonable thing to do.
A data file consists of one or several columns of data. Currently
the database can have at most 60 columns, but we plan to eliminate any
limit on the number of columns in future releases of Statist. The
columns of numbers must be separated from each other by tab
characters or spaces. Missing values must be indicated by the
capital letter `M'. Below is an example of a data file:
#Example data-file for statist
1 3 5 6
7 8 9 10
11 12 13 14
15 M 16 M
As you can infer from the above example, comments begin with the
symbol `#' and are ignored. Empty lines are also ignored. When Statist
reads the data file, each column is assigned to one variable. The first
column will be column `a', the second will be `b', etc. However, in
order to keep data files with many variables understandable,
it is also possible to give more meaningful names to columns. That has
the advantage that you no longer have to remember which
column a certain variable corresponds to. To do that, begin a line with
"#%". Like a comment line, the line must begin with `#', but
this symbol must be followed by `%'. Then, the names will be
assigned to the columns as follows:
#% kow kaw ec50
0.34 4.56 0.23
1.23 5.45 6.76
6.78 1.34 9.60
The number of variable labels declared must be exactly the same as
the number of columns. Only letters, digits, and `_' are allowed
in labels. Each column of the database is saved in a temporary
binary file in /tmp, where all values are stored as double precision
floating point numbers (real numbers). These files are erased when you
quit Statist. The missing values are stored as the smallest possible
number, that is: 1.7977 * 10^308. You have to be sure that
this number isn't in your data file as a valid number.
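The format rules above can be checked before a file is handed to
Statist. The following is a minimal sketch using sh and awk; the
function name check_datafile is hypothetical and is not part of
Statist. It only tests what this section states: comment and empty
lines are skipped, every data row has the same number of columns, and
a "#%" line, if present, declares exactly one valid label per column.

```shell
# Hypothetical helper (not part of Statist).
# check_datafile FILE : report format problems, return non-zero if any.
check_datafile() {
    awk '
        /^#%/ {
            # Label line: strip the marker; awk then re-splits the fields.
            sub(/^#%/, "")
            nlabels = NF
            for (i = 1; i <= NF; i++)
                if ($i !~ /^[A-Za-z0-9_]+$/) {
                    print "invalid label: " $i
                    bad = 1
                }
            next
        }
        /^#/ || /^[ \t]*$/ { next }    # comments and empty lines are ignored
        {
            # Data row: the first row fixes the expected column count.
            if (ncols == 0) ncols = NF
            if (NF != ncols) {
                print "line " NR ": expected " ncols " columns, found " NF
                bad = 1
            }
        }
        END {
            if (nlabels > 0 && ncols > 0 && nlabels != ncols) {
                print nlabels " labels declared for " ncols " columns"
                bad = 1
            }
            exit bad + 0
        }
    ' "$1"
}
```

For example, check_datafile mydata.dat (a hypothetical file name)
prints nothing and returns 0 when the file follows these rules.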
Before each analysis, Statist reads the selected columns from the temporary
files into RAM and, if necessary, either deletes the rows that have at
least one missing value or simply deletes the missing values. However, the
deletions occur only in the copy of the data that is created
in the computer memory. The temporary files remain intact until you
quit the program. For example, the menu option Regressions and
correlations - Multiple linear correlation will delete all rows that
have missing values in any one of the chosen columns. Use this analysis
if each row in your database represents a single case, which is very
common in social sciences. The menu option Tests - t-test for
comparison of two means of two sets will delete every missing value,
but a missing value in a column will not cause the entire row to be
deleted. Use this option if, for example, you have two series of
experiments and want to compare the two sets of results.
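The difference between the two ways of handling missing values can be
reproduced outside Statist. The awk sketch below is an illustration
only and does not show how Statist is implemented; the function names
rows_complete and column_values are hypothetical. The first drops every
row that contains an `M', the second drops only the missing values of
one column.

```shell
# Hypothetical helpers (not part of Statist).
# rows_complete FILE : print only the data rows without any missing value.
rows_complete() {
    awk '
        /^#/ || /^[ \t]*$/ { next }    # skip comments and empty lines
        {
            for (i = 1; i <= NF; i++)
                if ($i == "M") next    # one missing value drops the whole row
            print
        }
    ' "$1"
}

# column_values FILE N : print the non-missing values of column N.
column_values() {
    awk -v col="$2" '
        /^#/ || /^[ \t]*$/ { next }    # skip comments and empty lines
        $col != "M" { print $col }     # other columns do not matter here
    ' "$1"
}
```

Applied to the example data file shown earlier in this section,
rows_complete keeps the first three rows, while column_values for the
third column still returns all four of its values.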
If you want to work only with subsets of your database, you can write columns
into a text file (ASCII file), choosing the menu option Data
Management - Export columns as ASCII-data. You can also read data from
several files simultaneously (Data Management - Read another file).
This option is useful if you use data from different sources. When
you choose Read another file, new columns are added to the database, and if a
column label in the new file is already in use in the current database,
the symbol "_" will be appended to it.
Another possibility is to join columns (Data manipulation - Join
columns). In this case, the selected columns will be concatenated into
a single, longer column.
7 Batch/Script
If you have to repeat the same analysis many times, you will become
bored of starting Statist and choosing the same options from the menu
again and again. In this case, you can use the batch
mode. You have to invoke Statist with the option -silent, and feed
it a file containing what you would have to type if Statist were running
in the normal mode. For example, if you want to run a correlation
between variables "a" and "b" in a data file called "day365.dat", you
could create a file named, for example, "autopilot" with the following
content:
2
1
a
b
0
0
The next step would be to invoke statist with the following command:
statist -silent -noplot day365.dat < autopilot
The result will be printed on the screen. However, if you prefer to
have the results saved in a file called, say, report365, type:
statist -silent -noplot day365.dat < autopilot > report365
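The same idea extends to several data files. The following shell
sketch assumes that the files are named day*.dat and that each report
should be called report_ followed by the base name of the data file.
Both naming choices are assumptions made for this example. Only the
options -silent and -noplot described above are used, and the loop
skips the analysis if statist is not found in the PATH.

```shell
# Write the menu input once (same content as the "autopilot" file above).
printf '%s\n' 2 1 a b 0 0 > autopilot

# Run the same analysis on every matching data file.
for data in day*.dat; do
    [ -e "$data" ] || continue    # nothing matched: the pattern stays literal
    if command -v statist >/dev/null 2>&1; then
        # e.g. day365.dat -> report_day365
        statist -silent -noplot "$data" < autopilot > "report_${data%.dat}"
    else
        echo "statist not found in PATH; skipping $data" >&2
    fi
done
```

Because the menu input lives in a separate file, changing the analysis
means editing only the autopilot file, not the loop.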
8 Useful Tips
- If you choose the option List data from one column only to
discover that the column is too big, you can interrupt the listing by
typing any letter and then <Enter><Enter>.
- Please report any problem that you find (program bugs, documentation faults, etc.) to statist-list@itevation.de
- You can get the latest version of Statist from its website:
http://www.usf.uos.de/~breiter/tools/statist/index.en.html
File translated from TeX by TTH, version 3.67, on 15 Mar 2005, 10:34.