Copyright (C) 1986 - 1993, 1998 Thomas Williams, Colin Kelley
Permission to use, copy, and distribute this software and its documentation for any purpose with or without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation.
Permission to modify the software is granted, but not the right to distribute the complete modified source code. Modifications are to be distributed as patches to the released version. Permission to distribute binaries produced by compiling modified sources is granted, provided you
1. distribute the corresponding source modifications from the released version in the form of a patch file along with the binaries, 2. add special version identification to distinguish your version in addition to the base release version number, 3. provide your name and address as the primary contact for the support of your modified version, and 4. retain our contact information in regard to use of the base software.
Permission to distribute the released version of the source code along with corresponding source modifications in the form of a patch file is granted with same provisions 2 through 4 for binary distributions.
This software is provided "as is" without express or implied warranty to the extent permitted by applicable law.
AUTHORS
Original Software: Thomas Williams, Colin Kelley.
Gnuplot 2.0 additions: Russell Lang, Dave Kotz, John Campbell.
Gnuplot 3.0 additions: Gershon Elber and many others.
`gnuplot` is a command-driven interactive function and data plotting program. It is case sensitive (commands and function names written in lowercase are not the same as those written in CAPS). All command names may be abbreviated as long as the abbreviation is not ambiguous. Any number of commands may appear on a line (with the exception that section 2.8 load or section 2.2 call must be the final command), separated by semicolons (;). Strings are indicated with quotes. They may be either single or double quotation marks, e.g.,
load "filename"
cd 'dir'
although there are some subtle differences (see `syntax` for more details).
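As a brief illustration of the rules above, the following sketch puts two commands on one line and then repeats them with abbreviated keywords. The particular abbreviations (`se`, `tit`, `pl`) are assumptions chosen for illustration; whether a given abbreviation is unambiguous depends on the commands known to your version of `gnuplot`.

```gnuplot
# Two commands on one line, separated by a semicolon.
set title "Sine"; plot sin(x)

# The same two commands with abbreviated keywords (assumed unambiguous).
se tit 'Sine'; pl sin(x)
```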
Any command-line arguments are assumed to be names of files containing `gnuplot` commands, with the exception of standard X11 arguments, which are processed first. Each file is loaded with the section 2.8 load command, in the order specified. `gnuplot` exits after the last file is processed. When no load files are named, `gnuplot` enters into an interactive mode. The special filename "-" is used to denote standard input. See "help batch/interactive" for more details.
Many `gnuplot` commands have multiple options. These options must appear in the proper order, although unwanted ones may be omitted in most cases. Thus if the entire command is "command a b c", then "command a c" will probably work, but "command c a" will fail.
Commands may extend over several input lines by ending each line but the last with a backslash (\). The backslash must be the _last_ character on each line. The effect is as if the backslash and newline were not there. That is, no white space is implied, nor is a comment terminated. Therefore, commenting out a continued line comments out the entire command (see `comment`). But note that if an error occurs somewhere on a multi-line command, the parser may not be able to locate precisely where the error is and in that case will not necessarily point to the correct line.
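A minimal sketch of a continued command follows; the titles are arbitrary. Because the backslash and newline are simply removed, the three input lines are read as a single `plot` command.

```gnuplot
# One plot command spread over three input lines.
# Each backslash is the last character on its line.
plot sin(x) title 'sine', \
     cos(x) title 'cosine', \
     sin(x)*cos(x) title 'product'
```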
In this document, curly braces ({}) denote optional arguments and a vertical bar (|) separates mutually exclusive choices. `gnuplot` keywords or section 2.6 help topics are indicated by backquotes or `boldface` (where available). Angle brackets (<>) are used to mark replaceable tokens. In many cases, a default value of the token will be taken for optional arguments if the token is omitted, but these cases are not always denoted with braces around the angle brackets.
For on-line help on any topic, type section 2.6 help followed by the name of the topic or just section 2.6 help or `?` to get a menu of available topics.
The new `gnuplot` user should begin by reading about `plotting` (if on-line, type `help plotting`).
There is a mailing list for `gnuplot` users. Note, however, that the newsgroup
comp.graphics.apps.gnuplot
is identical to the mailing list (they both carry the same set of messages). We prefer that you read the messages through the newsgroup rather than subscribing to the mailing list. Administrative requests should be sent to
majordomo@dartmouth.edu
Send a message with the body (not the subject) consisting of the single word "help" (without the quotes) for more details.
The address for mailing to list members is:
info-gnuplot@dartmouth.edu
Bug reports and code contributions should be mailed to:
bug-gnuplot@dartmouth.edu
The list of those interested in beta-test versions is:
info-gnuplot-beta@dartmouth.edu
There is also a World Wide Web page with up-to-date information, including known bugs: http://www.cs.dartmouth.edu/gnuplot_info.html
Before seeking help, please check the FAQ (Frequently Asked Questions) list. If you do not have a copy of the FAQ, you may request a copy by email from the Majordomo address above, ftp a copy from
ftp://ftp.ucc.ie/pub/gnuplot/faq, ftp://ftp.gnuplot.vt.edu/pub/gnuplot/faq,
or see the WWW `gnuplot` page.
When posting a question, please include full details of the version of `gnuplot`, the machine, and operating system you are using. A _small_ script demonstrating the problem may be useful. Function plots are preferable to datafile plots. If emailing info-gnuplot, please state whether or not you are subscribed to the list, so that users who read it through the newsgroup will know to email a reply to you. There is a form for such postings on the WWW site.
Gnuplot version 3.7 contains many new features. This section gives a partial list and links to the new items in no particular order.
1. `fit f(x) 'file' via` uses the Marquardt-Levenberg method to fit data. (This is only slightly different from the `gnufit` patch available for 3.5.)
2. Greatly expanded section 2.10.1.7 using command. See section 2.10.1.7 using.
3. section 2.18.54 timefmt allows for the use of dates as input and output for time series plots. See `Time/Date data` and timedat.dem.
4. Multiline labels and font selection in some drivers.
5. Minor (unlabeled) tics. See section 2.18.33 mxtics.
6. section 2.18.22 key options for moving the key box in the page (and even outside of the plot), putting a title on it and a box around it, and more. See section 2.18.22 key.
7. Multiplots on a single logical page with section 2.18.31 multiplot.
8. Enhanced `postscript` driver with super/subscripts and font changes. (This was a separate driver (`enhpost`) that was available as a patch for 3.5.)
9. Second axes: use the top and right axes independently of the bottom and left, both for plotting and labels. See section 2.10 plot.
10. Special datafile names `'-'` and `""`. See section 2.10.1.5 special-filenames.
11. Additional coordinate systems for labels and arrows. See `coordinates`.
12. section 2.18.46 size can try to plot with a specified aspect ratio.
13. section 2.18.30 missing now treats missing data correctly.
14. The section 2.2 call command: section 2.8 load with arguments.
15. More flexible `range` commands with `reverse` and `writeback` keywords.
16. section 2.18.15 encoding for multi-lingual encoding.
17. New `x11` driver with persistent and multiple windows.
18. New plotting styles: section 2.18.47.15 xerrorbars, section 2.18.47.8 histeps, section 2.18.47.6 financebars and more. See section 2.18.47 style.
19. New tic label formats, including `"%l %L"` which uses the mantissa and exponents to a given base for labels. See `set format`.
20. New drivers, including `cgm` for inclusion into MS-Office applications and `gif` for serving plots to the WEB.
21. Smoothing and spline-fitting options for section 2.10 plot. See section 2.10.1.4 smooth.
22. section 2.18.29 margin and section 2.18.38 origin give much better control over where a graph appears on the page.
23. section 2.18.6 border now controls each border individually.
24. The new commands section 2.7 if and section 2.15 reread allow command loops.
25. Point styles and sizes, line types and widths can be specified on the section 2.10 plot command. Line types and widths can also be specified for grids, borders, tics and arrows. See section 2.10.6 with. Furthermore these types may be combined and stored for further use. See section 2.18.24 linestyle.
26. Text (labels, tic labels, and the time stamp) can be written vertically by those terminals capable of doing so.
`gnuplot` may be executed in either batch or interactive modes, and the two may even be mixed together on many systems.
Any command-line arguments are assumed to be names of files containing `gnuplot` commands (with the exception of standard X11 arguments, which are processed first). Each file is loaded with the section 2.8 load command, in the order specified. `gnuplot` exits after the last file is processed. When no load files are named, `gnuplot` enters into an interactive mode. The special filename "-" is used to denote standard input.
Both the section 2.4 exit and section 2.13 quit commands terminate the current command file and section 2.8 load the next one, until all have been processed.
Examples:
To launch an interactive session:
gnuplot
To launch a batch session using two command files "input1" and "input2":
gnuplot input1 input2
To launch an interactive session after an initialization file "header" and followed by another command file "trailer":
gnuplot header - trailer
Command-line editing is supported by the Unix, Atari, VMS, MS-DOS and OS/2 versions of `gnuplot`. Also, a history mechanism allows previous commands to be edited and re-executed. After the command line has been edited, a newline or carriage return will enter the entire line without regard to where the cursor is positioned.
(The readline function in `gnuplot` is not the same as the readline used in GNU Bash and GNU Emacs. If the GNU version is desired, it may be selected instead of the `gnuplot` version at compile time.)
The editing commands are as follows:
`Line-editing`:
^B          moves back a single character.
^F          moves forward a single character.
^A          moves to the beginning of the line.
^E          moves to the end of the line.
^H and DEL  delete the previous character.
^D          deletes the current character.
^K          deletes from current position to the end of line.
^L,^R       redraws line in case it gets trashed.
^U          deletes the entire line.
^W          deletes the last word.
`History`:
^P          moves back through history.
^N          moves forward through history.
On the IBM PC, the use of a TSR program such as DOSEDIT or CED may be desired for line editing. The default makefile assumes that this is the case; by default `gnuplot` will be compiled with no line-editing capability. If you want to use `gnuplot`'s line editing, set READLINE in the makefile and add readline.obj to the link file. The following arrow keys may be used on the IBM PC and Atari versions if readline is used:
Left Arrow        - same as ^B.
Right Arrow       - same as ^F.
Ctrl Left Arrow   - same as ^A.
Ctrl Right Arrow  - same as ^E.
Up Arrow          - same as ^P.
Down Arrow        - same as ^N.
The Atari version of readline defines some additional key aliases:
Undo       - same as ^L.
Home       - same as ^A.
Ctrl Home  - same as ^E.
Esc        - same as ^U.
Help       - section 2.6 help plus return.
Ctrl Help  - `help `.
Comments are supported as follows: a `#` may appear in most places in a line and `gnuplot` will ignore the rest of the line. It will not have this effect inside quotes, inside numbers (including complex numbers), inside command substitutions, etc. In short, it works anywhere it makes sense to work.
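The following sketch shows the three cases described above; the title text is arbitrary.

```gnuplot
# A comment may occupy a whole line.
plot sin(x)              # ... or follow a command on the same line
set title "Run #3"       # the # inside the quotes is part of the title
```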
The commands section 2.18.2 arrow, section 2.18.22 key, and section 2.18.23 label allow you to draw something at an arbitrary position on the graph. This position is specified by the syntax:
{<system>} <x>, {<system>} <y> {,{<system>} <z>}
Each <system> can either be `first`, `second`, `graph` or `screen`.
`first` places the x, y, or z coordinate in the system defined by the left and bottom axes; `second` places it in the system defined by the second axes (top and right); `graph` specifies the area within the axes--0,0 is bottom left and 1,1 is top right (for splot, 0,0,0 is bottom left of plotting area; use negative z to get to the base--see section 2.18.51 ticslevel); and `screen` specifies the screen area (the entire area--not just the portion selected by section 2.18.46 size), with 0,0 at bottom left and 1,1 at top right.
If the coordinate system for x is not specified, `first` is used. If the system for y is not specified, the one used for x is adopted.
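The defaulting rules can be seen in a short sketch. The label texts and positions are made up for illustration; only the position syntax described above is assumed.

```gnuplot
# No system given: x and y both use the first (bottom/left) axes.
set label 'data point' at 1.5, 0.9

# Fractions of the graph area: the middle of the area within the axes.
set label 'graph centre' at graph 0.5, graph 0.5

# Only x names a system, so y adopts it too (both are screen fractions).
set label 'page corner' at screen 0.05, 0.95

# Mixed systems: x on the first axis, y as a fraction of the graph height.
set label 'baseline mark' at first 2, graph 0.1

# An arrow across the graph area from bottom left to top right.
set arrow from graph 0, 0 to graph 1, 1

plot [0:3] sin(x)
```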
If one (or more) axis is timeseries, the appropriate coordinate should be given as a quoted time string according to the section 2.18.54 timefmt format string. See section 2.18.70 xdata and section 2.18.54 timefmt. `gnuplot` will also accept an integer expression, which will be interpreted as seconds from 1 January 2000.
A number of shell environment variables are understood by `gnuplot`. None of these are required, but may be useful.
If GNUTERM is defined, it is used as the name of the terminal type to be used. This overrides any terminal type sensed by `gnuplot` on start-up, but is itself overridden by the .gnuplot (or equivalent) start-up file (see `start-up`) and, of course, by later explicit changes.
On Unix, AmigaOS, AtariTOS, MS-DOS and OS/2, GNUHELP may be defined to be the pathname of the HELP file (gnuplot.gih).
On VMS, the logical name GNUPLOT$HELP should be defined as the name of the help library for `gnuplot`. The `gnuplot` help can be put inside any system help library, allowing access to help from both within and outside `gnuplot` if desired.
On Unix, HOME is used as the name of a directory to search for a .gnuplot file if none is found in the current directory. On AmigaOS, AtariTOS, MS-DOS and OS/2, the environment variable `gnuplot` is used instead. On VMS, SYS$LOGIN: is used. See `help start-up`.
On Unix, PAGER is used as an output filter for help messages.
On Unix, AtariTOS and AmigaOS, SHELL is used for the section 2.19 shell command. On MS-DOS and OS/2, COMSPEC is used for the section 2.19 shell command.
On MS-DOS, if the BGI or Watcom interface is used, PCTRM is used to tell the maximum resolution supported by your monitor by setting it to S<max. horizontal resolution>. E.g. if your monitor's maximum resolution is 800x600, then use:
set PCTRM=S800
If PCTRM is not set, standard VGA is used.
FIT_SCRIPT may be used to specify a `gnuplot` command to be executed when a fit is interrupted--see `fit`. FIT_LOG specifies the filename of the logfile maintained by fit.
In general, any mathematical expression accepted by C, FORTRAN, Pascal, or BASIC is valid. The precedence of these operators is determined by the specifications of the C programming language. White space (spaces and tabs) is ignored inside expressions.
Complex constants are expressed as {<real>,<imag>}, where <real> and <imag> must be numerical constants. For example, {3,2} represents 3 + 2i; {0,1} represents 'i' itself. The curly braces are explicitly required here.
Note that gnuplot uses both "real" and "integer" arithmetic, like FORTRAN and C. Integers are entered as "1", "-10", etc; reals as "1.0", "-10.0", "1e1", "3.5e-1", etc. The most important difference between the two forms is in division: division of integers truncates: 5/2 = 2; division of reals does not: 5.0/2.0 = 2.5. In mixed expressions, integers are "promoted" to reals before evaluation: 5/2e0 = 2.5. The result of division of a negative integer by a positive one may vary among compilers. Try a test like "print -5/2" to determine if your system chooses -2 or -3 as the answer.
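The cases above can be tried directly at the prompt. This is only a restatement of the examples in the text as commands:

```gnuplot
print 5/2        # both operands are integers, so the quotient is truncated
print 5.0/2.0    # both operands are reals
print 5/2e0      # the integer 5 is promoted to a real before dividing
print -5/2       # -2 or -3, depending on the compiler
```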
The integer expression "1/0" may be used to generate an "undefined" flag, which causes a point to be ignored; the `ternary` operator gives an example.
The real and imaginary parts of complex expressions are always real, whatever the form in which they are entered: in {3,2} the "3" and "2" are reals, not integers.
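A small sketch of complex constants in use follows. The variable name `z` is arbitrary; `real`, `imag` and `abs` are the functions of those names described in this document.

```gnuplot
z = {3,4}             # the complex constant 3 + 4i
print real(z)         # real part
print imag(z)         # imaginary part
print abs(z)          # length in the complex plane, sqrt(3**2 + 4**2)
print {0,1} * {0,1}   # i squared
```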
The functions in `gnuplot` are the same as the corresponding functions in the Unix math library, except that all functions accept integer, real, and complex arguments, unless otherwise noted.
For those functions that accept or return angles that may be given in either degrees or radians (sin(x), cos(x), tan(x), asin(x), acos(x), atan(x), atan2(y,x) and arg(z)), the unit may be selected by section 2.18.1 angles, which defaults to radians.
The `abs(x)` function returns the absolute value of its argument. The returned value is of the same type as the argument.
For complex arguments, abs(x) is defined as the length of x in the complex plane [i.e., sqrt(real(x)**2 + imag(x)**2) ].
The `acos(x)` function returns the arc cosine (inverse cosine) of its argument. `acos` returns its result in radians or degrees, as selected by section 2.18.1 angles.
The `acosh(x)` function returns the inverse hyperbolic cosine of its argument in radians.
The `arg(x)` function returns the phase of a complex number in radians or degrees, as selected by section 2.18.1 angles.
The `asin(x)` function returns the arc sine (inverse sine) of its argument. `asin` returns its result in radians or degrees, as selected by section 2.18.1 angles.
The `asinh(x)` function returns the inverse hyperbolic sine of its argument in radians.
The `atan(x)` function returns the arc tangent (inverse tangent) of its argument. `atan` returns its result in radians or degrees, as selected by section 2.18.1 angles.
The `atan2(y,x)` function returns the arc tangent (inverse tangent) of the ratio of the real parts of its arguments. `atan2` returns its result in radians or degrees, as selected by section 2.18.1 angles, in the correct quadrant.
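The effect of the angle unit on the inverse trigonometric functions can be checked with a few `print` commands. This is a sketch only; the arguments are chosen so that the results are easy to recognise.

```gnuplot
set angles degrees
print atan2(1, -1)    # second quadrant, reported in degrees
print asin(1)         # a right angle, reported in degrees

set angles radians    # restore the default unit
print atan2(1, -1)    # the same angle, now in radians
```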
The `atanh(x)` function returns the inverse hyperbolic tangent of its argument in radians.
The `besj0(x)` function returns the j0th Bessel function of its argument (the Bessel function of the first kind of order 0). `besj0` expects its argument to be in radians.
The `besj1(x)` function returns the j1st Bessel function of its argument (the Bessel function of the first kind of order 1). `besj1` expects its argument to be in radians.
The `besy0(x)` function returns the y0th Bessel function of its argument (the Bessel function of the second kind of order 0). `besy0` expects its argument to be in radians.
The `besy1(x)` function returns the y1st Bessel function of its argument (the Bessel function of the second kind of order 1). `besy1` expects its argument to be in radians.
The `ceil(x)` function returns the smallest integer that is not less than its argument. For complex numbers, `ceil` returns the smallest integer not less than the real part of its argument.
The `cos(x)` function returns the cosine of its argument. `cos` accepts its argument in radians or degrees, as selected by section 2.18.1 angles.
The `cosh(x)` function returns the hyperbolic cosine of its argument. `cosh` expects its argument to be in radians.
The `erf(x)` function returns the error function of the real part of its argument. If the argument is a complex value, the imaginary component is ignored.
The `erfc(x)` function returns 1.0 - the error function of the real part of its argument. If the argument is a complex value, the imaginary component is ignored.
The `exp(x)` function returns the exponential function of its argument (`e` raised to the power of its argument). On some implementations (notably suns), exp(-x) returns undefined for very large x. A user-defined function like safe(x) = x<-100 ? 0 : exp(x) might prove useful in these cases.
The `floor(x)` function returns the largest integer not greater than its argument. For complex numbers, `floor` returns the largest integer not greater than the real part of its argument.
The `gamma(x)` function returns the gamma function of the real part of its argument. For integer n, gamma(n+1) = n!. If the argument is a complex value, the imaginary component is ignored.
The `ibeta(p,q,x)` function returns the incomplete beta function of the real parts of its arguments. p, q > 0 and x in [0:1]. If the arguments are complex, the imaginary components are ignored.
The `inverf(x)` function returns the inverse error function of the real part of its argument.
The `igamma(a,x)` function returns the incomplete gamma function of the real parts of its arguments. a > 0 and x >= 0. If the arguments are complex, the imaginary components are ignored.
The `imag(x)` function returns the imaginary part of its argument as a real number.
The `invnorm(x)` function returns the inverse normal distribution function of the real part of its argument.
The `int(x)` function returns the integer part of its argument, truncated toward zero.
The `lgamma(x)` function returns the natural logarithm of the gamma function of the real part of its argument. If the argument is a complex value, the imaginary component is ignored.
The `log(x)` function returns the natural logarithm (base `e`) of its argument.
The `log10(x)` function returns the logarithm (base 10) of its argument.
The `norm(x)` function returns the normal distribution function (or Gaussian) of the real part of its argument.
The `rand(x)` function returns a pseudo random number in the interval [0:1] using the real part of its argument as a seed. If seed < 0, the sequence is (re)initialized. If the argument is a complex value, the imaginary component is ignored.
The `real(x)` function returns the real part of its argument.
The `sgn(x)` function returns 1 if its argument is positive, -1 if its argument is negative, and 0 if its argument is 0. If the argument is a complex value, the imaginary component is ignored.
The `sin(x)` function returns the sine of its argument. `sin` expects its argument to be in radians or degrees, as selected by section 2.18.1 angles.
The `sinh(x)` function returns the hyperbolic sine of its argument. `sinh` expects its argument to be in radians.
The `sqrt(x)` function returns the square root of its argument.
The `tan(x)` function returns the tangent of its argument. `tan` expects its argument to be in radians or degrees, as selected by section 2.18.1 angles.
The `tanh(x)` function returns the hyperbolic tangent of its argument. `tanh` expects its argument to be in radians.
A few additional functions are also available.
`column(x)` may be used only in expressions as part of section 2.10.1.7 using manipulations to fits or datafile plots. See section 2.10.1.7 using.
The `tm_hour` function interprets its argument as a time, in seconds from 1 Jan 2000. It returns the hour (an integer in the range 0--23) as a real.
The `tm_mday` function interprets its argument as a time, in seconds from 1 Jan 2000. It returns the day of the month (an integer in the range 1--31) as a real.
The `tm_min` function interprets its argument as a time, in seconds from 1 Jan 2000. It returns the minute (an integer in the range 0--59) as a real.
The `tm_mon` function interprets its argument as a time, in seconds from 1 Jan 2000. It returns the month (an integer in the range 1--12) as a real.
The `tm_sec` function interprets its argument as a time, in seconds from 1 Jan 2000. It returns the second (an integer in the range 0--59) as a real.
The `tm_wday` function interprets its argument as a time, in seconds from 1 Jan 2000. It returns the day of the week (an integer in the range 1--7) as a real.
The `tm_yday` function interprets its argument as a time, in seconds from 1 Jan 2000. It returns the day of the year (an integer in the range 1--366) as a real.
The `tm_year` function interprets its argument as a time, in seconds from 1 Jan 2000. It returns the year (an integer) as a real.
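As a sketch of how the time functions decompose a number of seconds, the value below is built by hand from days, hours, minutes and seconds. The variable name `t` is arbitrary, and the comments assume the epoch starts at midnight on the first day of a month, as stated above.

```gnuplot
# 2 days, 3 hours, 25 minutes and 7 seconds after the start of the epoch.
t = 2*86400 + 3*3600 + 25*60 + 7

print tm_mday(t)    # day of the month: the 3rd
print tm_hour(t)    # hour: 3
print tm_min(t)     # minute: 25
print tm_sec(t)     # second: 7
```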
`valid(x)` may be used only in expressions as part of section 2.10.1.7 using manipulations to fits or datafile plots. See section 2.10.1.7 using.
The operators in `gnuplot` are the same as the corresponding operators in the C programming language, except that all operators accept integer, real, and complex arguments, unless otherwise noted. The ** operator (exponentiation) is supported, as in FORTRAN.
Parentheses may be used to change order of evaluation.
The following is a list of all the unary operators and their usages:
Symbol      Example    Explanation
  -           -a          unary minus
  +           +a          unary plus (no-operation)
  ~           ~a        * one's complement
  !           !a        * logical negation
  !           a!        * factorial
  $           $3        * call arg/column during section 2.10.1.7 using manipulation
(*) Starred explanations indicate that the operator requires an integer argument.
Operator precedence is the same as in Fortran and C. As in those languages, parentheses may be used to change the order of operation. Thus -2**2 = -4, but (-2)**2 = 4.
The factorial operator returns a real number to allow a greater range.
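The precedence rule and the unary operators can be tried at the prompt. This is a sketch restating the text as commands:

```gnuplot
print -2**2      # exponentiation binds tighter than unary minus: -4
print (-2)**2    # parentheses change the order: 4
print 5!         # factorial, returned as a real
print !0         # logical negation of an integer
print ~0         # one's complement of an integer
```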
The following is a list of all the binary operators and their usages:
Symbol       Example      Explanation
  **          a**b          exponentiation
  *           a*b           multiplication
  /           a/b           division
  %           a%b         * modulo
  +           a+b           addition
  -           a-b           subtraction
  ==          a==b          equality
  !=          a!=b          inequality
  <           a<b           less than
  <=          a<=b          less than or equal to
  >           a>b           greater than
  >=          a>=b          greater than or equal to
  &           a&b         * bitwise AND
  ^           a^b         * bitwise exclusive OR
  |           a|b         * bitwise inclusive OR
  &&          a&&b        * logical AND
  ||          a||b        * logical OR
(*) Starred explanations indicate that the operator requires integer arguments.
Logical AND (&&) and OR (||) short-circuit the way they do in C. That is, the second `&&` operand is not evaluated if the first is false; the second `||` operand is not evaluated if the first is true.
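A few of the integer-only operators, and the short-circuit rule, are sketched below. The variable name `a` is arbitrary.

```gnuplot
print 7 % 3     # modulo: 1
print 6 & 3     # bitwise AND: 2
print 6 ^ 3     # bitwise exclusive OR: 5
print 6 | 3     # bitwise inclusive OR: 7
print 2**10     # exponentiation of two integers

# Short-circuit evaluation: because a != 0 is false, 1/a is never evaluated.
a = 0
print (a != 0) && (1/a > 2)
```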
There is a single ternary operator:
Symbol       Example      Explanation
  ?:          a?b:c         ternary operation
The ternary operator behaves as it does in C. The first argument (a), which must be an integer, is evaluated. If it is true (non-zero), the second argument (b) is evaluated and returned; otherwise the third argument (c) is evaluated and returned.
The ternary operator is very useful both in constructing piecewise functions and in plotting points only when certain conditions are met.
Examples:
Plot a function that is to equal sin(x) for 0 <= x < 1, 1/x for 1 <= x < 2, and undefined elsewhere:
f(x) = 0<=x && x<1 ? sin(x) : 1<=x && x<2 ? 1/x : 1/0
plot f(x)
Note that `gnuplot` quietly ignores undefined values, so the final branch of the function (1/0) will produce no plottable points. Note also that f(x) will be plotted as a continuous function across the discontinuity if a line style is used. To plot it discontinuously, create separate functions for the two pieces. (Parametric functions are also useful for this purpose.)
For data in a file, plot the average of the data in columns 2 and 3 against the datum in column 1, but only if the datum in column 4 is non-negative:
plot 'file' using 1:( $4<0 ? 1/0 : ($2+$3)/2 )
Please see section 2.10.1.7 using for an explanation of the section 2.10.1.7 using syntax.
New user-defined variables and functions of one through five variables may be declared and used anywhere, including on the section 2.10 plot command itself.
User-defined function syntax:
<func-name>( <dummy1> {,<dummy2>} ... {,<dummy5>} ) = <expression>
where <expression> is defined in terms of <dummy1> through <dummy5>.
User-defined variable syntax:
<variable-name> = <constant-expression>
Examples:
w = 2
q = floor(tan(pi/2 - 0.1))
f(x) = sin(w*x)
sinc(x) = sin(pi*x)/(pi*x)
delta(t) = (t == 0)
ramp(t) = (t > 0) ? t : 0
min(a,b) = (a < b) ? a : b
comb(n,k) = n!/(k!*(n-k)!)
len3d(x,y,z) = sqrt(x*x+y*y+z*z)
plot f(x) = sin(x*a), a = 0.2, f(x), a = 0.4, f(x)
Note that the variable `pi` is already defined. But it is in no way magic; you may redefine it to be whatever you like.
Valid names are the same as in most programming languages: they must begin with a letter, but subsequent characters may be letters, digits, "$", or "_". Note, however, that the `fit` mechanism uses several variables with names that begin "FIT_". It is safest to avoid using such names. "FIT_LIMIT", however, is one that you may wish to redefine. See the documentation on `fit` for details.
See section 2.18.18 functions, section 2.18.59 variables, and `fit`.
Throughout this document an attempt has been made to maintain consistency of nomenclature. This cannot be wholly successful because as `gnuplot` has evolved over time, certain command and keyword names have been adopted that preclude such perfection. This section contains explanations of the way some of these terms are used.
A "page" or "screen" is the entire area addressable by `gnuplot`. On a monitor, it is the full screen; on a plotter, it is a single sheet of paper.
A screen may contain one or more "plots". A plot is defined by an abscissa and an ordinate, although these need not actually appear on it, as well as the margins and any text written therein.
A plot contains one "graph". A graph is defined by an abscissa and an ordinate, although these need not actually appear on it.
A graph may contain one or more "lines". A line is a single function or data set. "Line" is also a plotting style. The word will also be used in the sense "a line of text". Presumably the context will remove any ambiguity.
The lines on a graph may have individual names. These may be listed together with a sample of the plotting style used to represent them in the "key", sometimes also called the "legend".
The word "title" occurs with multiple meanings in `gnuplot`. In this document, it will always be preceded by the adjective "plot", "line", or "key" to differentiate among them.
A graph may have up to four labelled axes. Various commands have the name of an axis built into their names, such as section 2.18.72 xlabel. Other commands have one or more axis names as options, such as `set logscale xy`. The names of the four axes for these usages are "x" for the axis along the bottom border of the plot, "y" for the left border, "x2" for the top border, and "y2" for the right border. "z" also occurs in commands used with 3-d plotting.
When discussing data files, the term "record" will be resurrected and used to denote a single line of text in the file, that is, the characters between newline or end-of-record characters. A "point" is the datum extracted from a single record. A "datablock" is a set of points from consecutive records, delimited by blank records. A line, when referred to in the context of a data file, is a subset of a datablock.
There are three `gnuplot` commands which actually create a plot: section 2.10 plot, `splot` and section 2.14 replot. section 2.10 plot generates 2-d plots, `splot` generates 3-d plots (actually 2-d projections, of course), and section 2.14 replot appends its arguments to the previous section 2.10 plot or `splot` and executes the modified command.
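A minimal sketch of the three commands follows; the functions plotted are arbitrary.

```gnuplot
plot sin(x)      # a 2-d plot of one function
replot cos(x)    # re-executes the previous plot with cos(x) appended
splot x*y        # a 3-d surface, drawn as a 2-d projection
```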
Much of the general information about plotting can be found in the discussion of section 2.10 plot; information specific to 3-d can be found in the `splot` section.
section 2.10 plot operates in either rectangular or polar coordinates -- see `set polar` for details of the latter. `splot` operates only in rectangular coordinates, but the section 2.18.28 mapping command allows for a few other coordinate systems to be treated. In addition, the section 2.10.1.7 using option allows both section 2.10 plot and `splot` to treat almost any coordinate system you'd care to define.
`splot` can plot surfaces and contours in addition to points and/or lines. In addition to `splot`, see section 2.18.21 isosamples for information about defining the grid for a 3-d function; `splot datafile` for information about the requisite file structure for 3-d data values; and section 2.18.11 contour and section 2.18.10 cntrparam for information about contours.
When `gnuplot` is run, it looks for an initialization file to load. This file is called `.gnuplot` on Unix and AmigaOS systems, and `GNUPLOT.INI` on other systems. If this file is not found in the current directory, the program will look for it in the home directory (under AmigaOS, Atari(single)TOS, MS-DOS and OS/2, the environment variable `gnuplot` should contain the name of this directory). Note: if NOCWDRC is defined during the installation, `gnuplot` will not read from the current directory.
If the initialization file is found, `gnuplot` executes the commands in it. These may be any legal `gnuplot` commands, but typically they are limited to setting the terminal and defining frequently-used functions or variables.
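Example:
A minimal initialization file might look like the following sketch. The terminal name assumes that the `x11` driver is available in your build, and the function and variable names are purely illustrative:
# .gnuplot (GNUPLOT.INI on other systems)
set terminal x11
sinc(x) = sin(x)/x
twopi = 2*pi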
Command-line substitution is specified by a system command enclosed in backquotes. This command is spawned and the output it produces replaces the name of the command (and backquotes) on the command line. Some implementations also support pipes; see section 2.10.1.5 special-filenames.
Newlines in the output produced by the spawned command are replaced with blanks.
Command-line substitution can be used anywhere on the `gnuplot` command line.
Example:
This will run the program `leastsq` and replace `leastsq` (including backquotes) on the command line with its output:
f(x) = `leastsq`
or, in VMS
f(x) = `run leastsq`
The general rules of syntax and punctuation in `gnuplot` are that keywords and options are order-dependent. Options and any accompanying parameters are separated by spaces whereas lists and coordinates are separated by commas. Ranges are separated by colons and enclosed in brackets [], text and file names are enclosed in quotes, and a few miscellaneous things are enclosed in parentheses. Braces {} are used for a few special purposes.
Commas are used to separate coordinates on the `set` commands section 2.18.2 arrow, section 2.18.22 key, and section 2.18.23 label; the list of variables being fitted (the list after the `via` keyword on the `fit` command); lists of discrete contours or the loop parameters which specify them on the section 2.18.10 cntrparam command; the arguments of the `set` commands section 2.18.13 dgrid3d, section 2.18.14 dummy, section 2.18.21 isosamples, section 2.18.37 offsets, section 2.18.38 origin, section 2.18.45 samples, section 2.18.46 size, `time`, and section 2.18.61 view; lists of tics or the loop parameters which specify them; the offsets for titles and axis labels; parametric functions to be used to calculate the x, y, and z coordinates on the section 2.10 plot, section 2.14 replot and `splot` commands; and the complete sets of keywords specifying individual plots (data sets or functions) on the section 2.10 plot, section 2.14 replot and `splot` commands.
Parentheses are used to delimit sets of explicit tics (as opposed to loop parameters) and to indicate computations in the section 2.10.1.7 using filter of the `fit`, section 2.10 plot, section 2.14 replot and `splot` commands.
(Parentheses and commas are also used as usual in function notation.)
Brackets are used to delimit ranges, whether they are given on `set`, section 2.10 plot or `splot` commands.
Colons are used to separate extrema in `range` specifications (whether they are given on `set`, section 2.10 plot or `splot` commands) and to separate entries in the section 2.10.1.7 using filter of the section 2.10 plot, section 2.14 replot, `splot` and `fit` commands.
Semicolons are used to separate commands given on a single command line.
Braces are used in text to be specially processed by some terminals, like `postscript`. They are also used to denote complex numbers: {3,2} = 3 + 2i.
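Example:
The following sketch combines most of these punctuation rules in a few commands. The file name 'data.dat' and its column layout are assumptions made only for illustration:
set xrange [0:10]; set samples 200
z = {3,2} * {0,1}
plot [0:10] [-2:2] 'data.dat' using 1:($2/100) with lines, sin(x) title "sine"
Here the semicolon separates two commands on one line; brackets and colons delimit the ranges; braces write the complex numbers 3 + 2i and i (so z is -2 + 3i); colons separate the using entries, with parentheses marking a computed entry; and the comma separates the two individual plots.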
Text may be enclosed in single- or double-quotes. Backslash processing of sequences like \n (newline) and \345 (octal character code) is performed for double-quoted strings, but not for single-quoted strings.
The justification is the same for each line of a multi-line string. Thus the center-justified string
"This is the first line of text.\nThis is the second line."
will produce
This is the first line of text.
This is the second line.
but
'This is the first line of text.\nThis is the second line.'
will produce
This is the first line of text.\nThis is the second line.
Filenames may be entered with either single- or double-quotes. In this manual the command examples generally single-quote filenames and double-quote other string tokens for clarity.
At present you should not embed \n inside {} when using the enhanced option of the postscript terminal.
The EEPIC, Imagen, Uniplex, LaTeX, and TPIC drivers allow a newline to be specified by \\ in a single-quoted string or \\\\ in a double-quoted string.
Back-quotes are used to enclose system commands for substitution.
`gnuplot` supports the use of time and/or date information as input data. This feature is activated by the commands `set xdata time`, `set ydata time`, etc.
Internally all times and dates are converted to the number of seconds from the year 2000. The command section 2.18.54 timefmt defines the format for all inputs: data files, ranges, tics, label positions--in short, anything that accepts a data value must receive it in this format. Since only one input format can be in force at a given time, all time/date quantities being input at the same time must be presented in the same format. Thus if both x and y data in a file are time/date, they must be in the same format.
The conversion to and from seconds assumes Universal Time (which is the same as Greenwich Standard Time). There is no provision for changing the time zone or for daylight savings. If all your data refer to the same time zone (and are all either daylight or standard) you don't need to worry about these things. But if the absolute time is crucial for your application, you'll need to convert to UT yourself.
Commands like section 2.18.74 xrange will re-interpret the integer according to section 2.18.54 timefmt. If you change section 2.18.54 timefmt, and then `show` the quantity again, it will be displayed in the new section 2.18.54 timefmt. For that matter, if you give the deactivation command (like section 2.18.70 xdata), the quantity will be shown in its numerical form.
The command `set format` defines the format that will be used for tic labels, whether or not the specified axis is time/date.
If time/date information is to be plotted from a file, the section 2.10.1.7 using option _must_ be used on the section 2.10 plot or `splot` command. These commands simply use white space to separate columns, but white space may be embedded within the time/date string. If you use tabs as a separator, some trial-and-error may be necessary to discover how your system treats them.
The following example demonstrates time/date plotting.
Suppose the file "data" contains records like
03/21/95 10:00 6.02e23
This file can be plotted by
set xdata time
set timefmt "%m/%d/%y"
set xrange ["03/21/95":"03/22/95"]
set format x "%m/%d"
set timefmt "%m/%d/%y %H:%M"
plot "data" using 1:3
which will produce xtic labels that look like "03/21".
See the descriptions of each command for more details.
This section lists the commands acceptable to `gnuplot` in alphabetical order. Printed versions of this document contain all commands; on-line versions may not be complete. Indeed, on some systems there may be no commands at all listed under this heading.
Note that in most cases unambiguous abbreviations for command names and their options are permissible, e.g., "`p f(x) w l`" instead of "`plot f(x) with lines`".
In the syntax descriptions, braces ({}) denote optional arguments and a vertical bar (|) separates mutually exclusive choices.
The section 2.1 cd command changes the working directory.
Syntax:
cd '<directory-name>'
The directory name must be enclosed in quotes.
Examples:
cd 'subdir'
cd ".."
DOS users _must_ use single-quotes--backslash [\] has special significance inside double-quotes. For example,
cd "c:\newdata"
fails, but
cd 'c:\newdata'
works as expected.
The section 2.2 call command is identical to the load command with one exception: you can have up to ten additional parameters to the command (delimited according to the standard parser rules) which can be substituted into the lines read from the file. As each line is read from the section 2.2 called input file, it is scanned for the sequence `$` (dollar-sign) followed by a digit (0--9). If found, the sequence is replaced by the corresponding parameter from the section 2.2 call command line. If the parameter was specified as a string in the section 2.2 call line, it is substituted without its enclosing quotes. `$` followed by any character other than a digit will be that character. E.g. use `$$` to get a single `$`. Providing more than ten parameters on the section 2.2 call command line will cause an error. A parameter that was not provided substitutes as nothing. Files being section 2.2 called may themselves contain section 2.2 call or section 2.8 load commands.
The section 2.2 call command _must_ be the last command on a multi-command line.
Syntax:
call "<input-file>" <parameter-0> <parm-1> ... <parm-9>
The name of the input file must be enclosed in quotes, and it is recommended that parameters are similarly enclosed in quotes (future versions of gnuplot may treat quoted and unquoted arguments differently).
Example:
If the file 'calltest.gp' contains the line:
print "p0=$0 p1=$1 p2=$2 p3=$3 p4=$4 p5=$5 p6=$6 p7=x$7x"
entering the command:
call 'calltest.gp' "abcd" 1.2 + "'quoted'" -- "$2"
will display:
p0=abcd p1=1.2 p2=+ p3='quoted' p4=- p5=- p6=$2 p7=xx
NOTE: there is a clash in syntax with the datafile section 2.10.1.7 using callback operator. Use `$$n` or `column(n)` to access column n from a datafile inside a section 2.2 called datafile plot.
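Example:
As a sketch of the `$$n` form, suppose a hypothetical file 'scale.gp' contains the line:
plot '$0' using 1:($$2*$1) with lines
Entering the command:
call 'scale.gp' 'data.dat' "1000"
should then, by the substitution rules above, be read as:
plot 'data.dat' using 1:($2*1000) with lines
The file names and the scale factor are assumptions chosen for illustration. `$0` and `$1` are replaced by the call parameters, while `$$2` collapses to `$2` and so still refers to column 2 of the datafile.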
The section 2.3 clear command erases the current screen or output device as specified by section 2.18.39 output. This usually generates a formfeed on hardcopy devices. Use section 2.18.49 terminal to set the device type.
For some terminals section 2.3 clear erases only the portion of the plotting surface defined by section 2.18.46 size, so for these it can be used in conjunction with section 2.18.31 multiplot to create an inset.
Example:
set multiplot
plot sin(x)
set origin 0.5,0.5
set size 0.4,0.4
clear
plot cos(x)
set nomultiplot
Please see section 2.18.31 multiplot, section 2.18.46 size, and section 2.18.38 origin for details of these commands.
The commands section 2.4 exit and section 2.13 quit and the END-OF-FILE character will exit the current `gnuplot` command file and section 2.8 load the next one. See "help batch/interactive" for more details.
Each of these commands will clear the output device (as does the section 2.3 clear command) before exiting.
The `fit` command can fit a user-defined function to a set of data points (x,y) or (x,y,z), using an implementation of the nonlinear least-squares (NLLS) Marquardt-Levenberg algorithm. Any user-defined variable occurring in the function body may serve as a fit parameter, but the return type of the function must be real.
Syntax:
fit {[xrange] {[yrange]}} <function> '<datafile>' {datafile-modifiers} via '<parameter file>' | <var1>{,<var2>,...}
Ranges may be specified to temporarily limit the data which is to be fitted; any out-of-range data points are ignored. The syntax is
[{dummy_variable=}{<min>}{:<max>}],
analogous to section 2.10 plot; see section 2.10.4 ranges.
<function> is any valid `gnuplot` expression, although it is usual to use a previously user-defined function of the form f(x) or f(x,y).
<datafile> is treated as in the section 2.10 plot command. All the `plot datafile` modifiers (section 2.10.1.7 using, section 2.10.1.1 every,...) except section 2.10.1.4 smooth are applicable to `fit`. See `plot datafile`.
The default data formats for fitting functions with a single independent variable, y=f(x), are {x:}y or x:y:s; those formats can be changed with the datafile section 2.10.1.7 using qualifier. The third item, (a column number or an expression), if present, is interpreted as the standard deviation of the corresponding y value and is used to compute a weight for the datum, 1/s**2. Otherwise, all data points are weighted equally, with a weight of one.
To fit a function with two independent variables, z=f(x,y), the required format is section 2.10.1.7 using with four items, x:y:z:s. The complete format must be given--no default columns are assumed for a missing token. Weights for each data point are evaluated from 's' as above. If error estimates are not available, a constant value can be specified as a constant expression (see section 2.10.1.7 using), e.g., `using 1:2:3:(1)`.
Multiple datasets may be simultaneously fit with functions of one independent variable by making y a 'pseudo-variable', e.g., the dataline number, and fitting as two independent variables. See `fit multibranch`.
The `via` qualifier specifies which parameters are to be adjusted, either directly, or by referencing a parameter file.
Examples:
f(x) = a*x**2 + b*x + c
g(x,y) = a*x**2 + b*y**2 + c*x*y
FIT_LIMIT = 1e-6
fit f(x) 'measured.dat' via 'start.par'
fit f(x) 'measured.dat' using 3:($7-5) via 'start.par'
fit f(x) './data/trash.dat' using 1:2:3 via a, b, c
fit g(x,y) 'surface.dat' using 1:2:3:(1) via a, b, c
After each iteration step, detailed information about the current state of the fit is written to the display. The same information about the initial and final states is written to a log file, "fit.log". This file is always appended to, so as to not lose any previous fit history; it should be deleted or renamed as desired.
The fit may be interrupted by pressing Ctrl-C (any key but Ctrl-C under MSDOS and Atari Multitasking Systems). After the current iteration completes, you have the option to (1) stop the fit and accept the current parameter values, (2) continue the fit, (3) execute a `gnuplot` command as specified by the environment variable FIT_SCRIPT. The default for FIT_SCRIPT is section 2.14 replot, so if you had previously plotted both the data and the fitting function in one graph, you can display the current state of the fit.
Once `fit` has finished, the section 2.22 update command may be used to store final values in a file for subsequent use as a parameter file. See section 2.22 update for details.
There are two ways that `via` can specify the parameters to be adjusted, either directly on the command line or indirectly, by referencing a parameter file. The two use different means to set initial values.
Adjustable parameters can be specified by a comma-separated list of variable names after the `via` keyword. Any variable that is not already defined is created with an initial value of 1.0. However, the fit is more likely to converge rapidly if the variables have been previously declared with more appropriate starting values.
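Example:
A sketch of declaring starting values before fitting. The file name and the numbers are illustrative assumptions:
f(x) = a*x**2 + b*x + c
a = 0.5; b = -2.0; c = 10.0
fit f(x) 'measured.dat' via a, b, c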
In a parameter file, each parameter to be varied and a corresponding initial value are specified, one per line, in the form
varname = value
Comments, marked by '#', and blank lines are permissible. The special form
varname = value # FIXED
means that the variable is treated as a 'fixed parameter', initialized by the parameter file, but not adjusted by `fit`. For clarity, it may be useful to designate variables as fixed parameters so that their values are reported by `fit`. The keyword `# FIXED` has to appear in exactly this form.
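Example:
A hypothetical parameter file 'start.par' for f(x) = a*x**2 + b*x + c, in which a and b are adjusted and c is held fixed, might read as follows. The values are illustrative only:
# starting values for the quadratic fit
a = 0.5
b = -2.0
c = 10.0 # FIXED
It would be used as
fit f(x) 'measured.dat' via 'start.par'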
`fit` is used to find a set of parameters that 'best' fits your data to your user-defined function. The fit is judged on the basis of the sum of the squared differences or 'residuals' (SSR) between the input data points and the function values, evaluated at the same places. This quantity is often called 'chisquare' (i.e., the Greek letter chi, to the power of 2). The algorithm attempts to minimize SSR, or more precisely, WSSR, as the residuals are 'weighted' by the input data errors (or 1.0) before being squared; see `fit error_estimates` for details.
That's why it is called 'least-squares fitting'. Let's look at an example to see what is meant by 'non-linear', but first we had better go over some terms. Here it is convenient to use z as the dependent variable for user-defined functions of either one independent variable, z=f(x), or two independent variables, z=f(x,y). A parameter is a user-defined variable that `fit` will adjust, i.e., an unknown quantity in the function declaration. Linearity/non-linearity refers to the relationship of the dependent variable, z, to the parameters which `fit` is adjusting, not of z to the independent variables, x and/or y. (To be technical, the second {and higher} derivatives of the fitting function with respect to the parameters are zero for a linear least-squares problem).
For linear least-squares (LLS), the user-defined function will be a sum of simple functions, not involving any parameters, each multiplied by one parameter. NLLS handles more complicated functions in which parameters can be used in a large number of ways. An example that illustrates the difference between linear and nonlinear least-squares is the Fourier series. One member may be written as
z=a*sin(c*x) + b*cos(c*x).
If a and b are the unknown parameters and c is constant, then estimating values of the parameters is a linear least-squares problem. However, if c is an unknown parameter, the problem is nonlinear.
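Example:
In `fit` terms, the distinction might be sketched as follows, assuming a hypothetical data file 'series.dat' and an illustrative value of c:
f(x) = a*sin(c*x) + b*cos(c*x)
c = 2.0
fit f(x) 'series.dat' via a, b      # c held constant: linear in a and b
fit f(x) 'series.dat' via a, b, c   # c adjusted as well: nonlinear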
In the linear case, parameter values can be determined by comparatively simple linear algebra, in one direct step. However LLS is a special case which is also solved along with more general NLLS problems by the iterative procedure that `gnuplot` uses. `fit` attempts to find the minimum by doing a search. Each step (iteration) calculates WSSR with a new set of parameter values. The Marquardt-Levenberg algorithm selects the parameter values for the next iteration. The process continues until a preset criterion is met, either (1) the fit has "converged" (the relative change in WSSR is less than FIT_LIMIT), or (2) it reaches a preset iteration count limit, FIT_MAXITER (see section 2.18.59 variables). The fit may also be interrupted and subsequently halted from the keyboard (see `fit`).
Often the function to be fitted will be based on a model (or theory) that attempts to describe or predict the behaviour of the data. Then `fit` can be used to find values for the free parameters of the model, to determine how well the data fits the model, and to estimate an error range for each parameter. See `fit error_estimates`.
Alternatively, in curve-fitting, functions are selected independent of a model (on the basis of experience as to which are likely to describe the trend of the data with the desired resolution and a minimum number of parameters*functions.) The `fit` solution then provides an analytic representation of the curve.
However, if all you really want is a smooth curve through your data points, the section 2.10.1.4 smooth option to section 2.10 plot may be what you've been looking for rather than `fit`.
In `fit`, the term "error" is used in two different contexts, data error estimates and parameter error estimates.
Data error estimates are used to calculate the relative weight of each data point when determining the weighted sum of squared residuals, WSSR or chisquare. They can affect the parameter estimates, since they determine how much influence the deviation of each data point from the fitted function has on the final values. Some of the `fit` output information, including the parameter error estimates, is more meaningful if accurate data error estimates have been provided.
The 'statistical overview' describes some of the `fit` output and gives some background for the 'practical guidelines'.
The theory of non-linear least-squares (NLLS) is generally described in terms of a normal distribution of errors, that is, the input data is assumed to be a sample from a population having a given mean and a Gaussian (normal) distribution about the mean with a given standard deviation. For a sample of sufficiently large size, and knowing the population standard deviation, one can use the statistics of the chisquare distribution to describe a "goodness of fit" by looking at the variable often called "chisquare". Here, it is sufficient to say that a reduced chisquare (chisquare/degrees of freedom, where degrees of freedom is the number of datapoints less the number of parameters being fitted) of 1.0 is an indication that the weighted sum of squared deviations between the fitted function and the data points is the same as that expected for a random sample from a population characterized by the function with the current value of the parameters and the given standard deviations.
If the standard deviation for the population is not constant, as in counting statistics where variance = counts, then each point should be individually weighted when comparing the observed sum of deviations and the expected sum of deviations.
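Example:
For counting data, one way to supply such individual weights is to compute the standard deviation from the counts themselves in the using specification. This is only a sketch: the file 'counts.dat' (x in column 1, counts in column 2) and the model are assumptions, and it presumes every count is greater than zero, since a zero count would give a zero standard deviation:
f(x) = a*exp(-x/b)
fit f(x) 'counts.dat' using 1:2:(sqrt($2)) via a, b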
At the conclusion `fit` reports 'stdfit', the standard deviation of the fit, which is the rms of the residuals, and the variance of the residuals, also called 'reduced chisquare' when the data points are weighted. The number of degrees of freedom (the number of data points minus the number of fitted parameters) is used in these estimates because the parameters used in calculating the residuals of the datapoints were obtained from the same data.
To estimate confidence levels for the parameters, one can use the minimum chisquare obtained from the fit and chisquare statistics to determine the value of chisquare corresponding to the desired confidence level, but considerably more calculation is required to determine the combinations of parameters which produce such values.
Rather than determine confidence intervals, `fit` reports parameter error estimates which are readily obtained from the variance-covariance matrix after the final iteration. By convention, these estimates are called "standard errors" or "asymptotic standard errors", since they are calculated in the same way as the standard errors (standard deviation of each parameter) of a linear least-squares problem, even though the statistical conditions for designating the quantity calculated to be a standard deviation are not generally valid for the NLLS problem. The asymptotic standard errors are generally over-optimistic and should not be used for determining confidence levels, but are useful for qualitative purposes.
The final solution also produces a correlation matrix, which gives an indication of the correlation of parameters in the region of the solution; if one parameter is changed, increasing chisquare, does changing another compensate? The main diagonal elements, autocorrelation, are all 1; if all parameters were independent, all other elements would be nearly 0. Two variables which completely compensate each other would have an off-diagonal element of unit magnitude, with a sign depending on whether the relation is proportional or inversely proportional. The smaller the magnitudes of the off-diagonal elements, the closer the estimates of the standard deviation of each parameter would be to the asymptotic standard error.
If you have a basis for assigning weights to each data point, doing so lets you make use of additional knowledge about your measurements, e.g., take into account that some points may be more reliable than others. That may affect the final values of the parameters.
Weighting the data provides a basis for interpreting the additional `fit` output after the last iteration. Even if you weight each point equally, estimating an average standard deviation rather than using a weight of 1 makes WSSR a dimensionless variable, as chisquare is by definition.
Each fit iteration will display information which can be used to evaluate the progress of the fit. (An '*' indicates that it did not find a smaller WSSR and is trying again.) The 'sum of squares of residuals', also called 'chisquare', is the WSSR between the data and your fitted function; `fit` has minimized that. At this stage, with weighted data, chisquare is expected to approach the number of degrees of freedom (data points minus parameters). The WSSR can be used to calculate the reduced chisquare (WSSR/ndf) or stdfit, the standard deviation of the fit, sqrt(WSSR/ndf). Both of these are reported for the final WSSR.
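Example:
The arithmetic can be checked by hand with `print`. The numbers below are invented purely for illustration (a final WSSR of 47.3 from 50 data points and 3 fitted parameters):
wssr = 47.3; npoints = 50; nparams = 3
ndf = npoints - nparams
print wssr/ndf
print sqrt(wssr/ndf)
The first value is the reduced chisquare and the second is stdfit; with these made-up numbers both come out close to 1.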
If the data are unweighted, stdfit is the rms value of the deviation of the data from the fitted function, in user units.
If you supplied valid data errors, the number of data points is large enough, and the model is correct, the reduced chisquare should be about unity. (For details, look up the 'chi-squared distribution' in your favourite statistics reference.) If so, there are additional tests, beyond the scope of this overview, for determining how well the model fits the data.
A reduced chisquare much larger than 1.0 may be due to incorrect data error estimates, data errors not normally distributed, systematic measurement errors, 'outliers', or an incorrect model function. A plot of the residuals, e.g., `plot 'datafile' using 1:($2-f($1))`, may help to show any systematic trends. Plotting both the data points and the function may help to suggest another model.
Similarly, a reduced chisquare less than 1.0 indicates WSSR is less than that expected for a random sample from the function with normally distributed errors. The data error estimates may be too large, the statistical assumptions may not be justified, or the model function may be too general, fitting fluctuations in a particular sample in addition to the underlying trends. In the latter case, a simpler function may be more appropriate.
You'll have to get used to both `fit` and the kind of problems you apply it to before you can relate the standard errors to some more practical estimates of parameter uncertainties or evaluate the significance of the correlation matrix.
Note that `fit`, in common with most NLLS implementations, minimizes the weighted sum of squared distances (y-f(x))**2. It does not provide any means to account for "errors" in the values of x, only in y. Also, any "outliers" (data points outside the normal distribution of the model) will have an exaggerated effect on the solution.
There are a number of `gnuplot` variables that can be defined to affect `fit`. Those which can be defined once `gnuplot` is running are listed under 'control_variables' while those defined before starting `gnuplot` are listed under 'environment_variables'.
The default epsilon limit (1e-5) may be changed by declaring a value for
FIT_LIMIT
When the sum of squared residuals changes between two iteration steps by a factor less than this number (epsilon), the fit is considered to have 'converged'.
The maximum number of iterations may be limited by declaring a value for
FIT_MAXITER
A value of 0 (or not defining it at all) means that there is no limit.
If you need even more control over the algorithm, and know the Marquardt-Levenberg algorithm well, there are some more variables to influence it. The startup value of `lambda` is normally calculated automatically from the ML-matrix, but if you want to, you may provide your own value with
FIT_START_LAMBDA
Specifying FIT_START_LAMBDA as zero or less will re-enable the automatic selection. The variable
FIT_LAMBDA_FACTOR
gives the factor by which `lambda` is increased or decreased whenever the chi-squared target function increased or decreased significantly. Setting FIT_LAMBDA_FACTOR to zero re-enables the default factor of 10.0.
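Example:
The control variables are ordinary `gnuplot` variables, so they are simply assigned before `fit` is run. The values below are illustrative assumptions, not recommendations:
f(x) = a*x**2 + b*x + c
FIT_LIMIT = 1e-6
FIT_MAXITER = 50
FIT_START_LAMBDA = 0.01
FIT_LAMBDA_FACTOR = 5.0
fit f(x) 'measured.dat' via a, b, c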
Other variables with the FIT_ prefix may be added to `fit`, so it is safer not to use that prefix for user-defined variables.
The variables FIT_SKIP and FIT_INDEX were used by earlier releases of `gnuplot` with a 'fit' patch called `gnufit` and are no longer available. The datafile section 2.10.1.1 every modifier provides the functionality of FIT_SKIP. FIT_INDEX was used for multi-branch fitting, but multi-branch fitting of one independent variable is now done as a pseudo-3D fit in which the second independent variable and section 2.10.1.7 using are used to specify the branch. See section 2.5.5 multi-branch.
The environment variables must be defined before `gnuplot` is executed; how to do so depends on your operating system.
FIT_LOG
changes the name (and/or path) of the file to which the fit log will be written from the default of "fit.log" in the working directory.
FIT_SCRIPT
specifies a command that may be executed after a user interrupt. The default is section 2.14 replot, but a section 2.10 plot or section 2.8 load command may be useful to display a plot customized to highlight the progress of the fit.
In multi-branch fitting, multiple data sets can be simultaneously fit with functions of one independent variable having common parameters by minimizing the total WSSR. The function and parameters (branch) for each data set are selected by using a 'pseudo-variable', e.g., either the dataline number (a 'column' index of -1) or the datafile index (-2), as the second independent variable.
Example: Given two exponential decays of the form, z=f(x), each describing a different data set but having a common decay time, estimate the values of the parameters. If the datafile has the format x:z:s, then
f(x,y) = (y==0) ? a*exp(-x/tau) : b*exp(-x/tau)
fit f(x,y) 'datafile' using 1:-1:2:3 via a, b, tau
For a more complicated example, see the file "hexa.fnc" used by the "fit.dem" demo.
Appropriate weighting may be required since unit weights may cause one branch to predominate if there is a difference in the scale of the dependent variable. Fitting each branch separately, using the multi-branch solution as initial values, may give an indication as to the relative effect of each branch on the joint solution.
Nonlinear fitting is not guaranteed to converge to the global optimum (the solution with the smallest sum of squared residuals, SSR), and can get stuck at a local minimum. The routine has no way to determine that; it is up to you to judge whether this has happened.
`fit` may, and often will, get "lost" if started far from a solution, where SSR is large and changing slowly as the parameters are varied, or it may reach a numerically unstable region (e.g., too large a number causing a floating point overflow) which results in an "undefined value" message or `gnuplot` halting.
To improve the chances of finding the global optimum, you should set the starting values at least roughly in the vicinity of the solution, e.g., within an order of magnitude, if possible. The closer your starting values are to the solution, the less chance of stopping at another minimum. One way to find starting values is to plot data and the fitting function on the same graph and change parameter values and section 2.14 replot until reasonable similarity is reached. The same plot is also useful to check whether the fit stopped at a minimum with a poor fit.
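Example:
A sketch of that trial-and-error procedure, assuming a hypothetical file 'decay.dat' and an exponential model. The starting numbers are arbitrary:
f(x) = a*exp(-x/tau)
a = 1.0; tau = 5.0
plot 'decay.dat', f(x)
tau = 8.0
replot
Once the curve roughly follows the data, the current values serve as starting values:
fit f(x) 'decay.dat' via a, tau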
Of course, a reasonably good fit is not proof there is not a "better" fit (in either a statistical sense, characterized by an improved goodness-of-fit criterion, or a physical sense, with a solution more consistent with the model.) Depending on the problem, it may be desirable to `fit` with various sets of starting values, covering a reasonable range for each parameter.
Here are some tips to keep in mind to get the most out of `fit`. They're not very organized, so you'll have to read them several times until their essence has sunk in.
The two forms of the `via` argument to `fit` serve two largely distinct purposes. The `via "file"` form is best used for (possibly unattended) batch operation, where you just supply the startup values in a file and can later use section 2.22 update to copy the results back into another (or the same) parameter file.
The `via var1, var2, ...` form is best used interactively, where the command history mechanism may be used to edit the list of parameters to be fitted or to supply new startup values for the next try. This is particularly useful for hard problems, where a direct fit to all parameters at once won't work without good starting values. To find such, you can iterate several times, fitting only some of the parameters, until the values are close enough to the goal that the final fit to all parameters at once will work.
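Example:
Such a staged approach might look like the following sketch, in which the function, the file name and the order of the parameters are assumptions chosen for illustration:
f(x) = a*exp(-x/tau) + c
fit f(x) 'measured.dat' via a
fit f(x) 'measured.dat' via a, tau
fit f(x) 'measured.dat' via a, tau, c
Each `fit` starts from the values left by the previous one.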
Make sure that there is no mutual dependency among parameters of the function you are fitting. For example, don't try to fit a*exp(x+b), because a*exp(x+b)=a*exp(b)*exp(x). Instead, fit either a*exp(x) or exp(x+b).
A technical issue: the parameters must not be too different in magnitude. The larger the ratio of the largest and the smallest absolute parameter values, the slower the fit will converge. If the ratio is close to or above the inverse of the machine floating point precision, it may take next to forever to converge, or refuse to converge at all. You will have to adapt your function to avoid this, e.g., replace 'parameter' by '1e9*parameter' in the function definition, and divide the starting value by 1e9.
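Example:
As a sketch, suppose the slope of a straight-line model is expected to be about 3e9 while the offset is of order one (both figures are invented for illustration). Rather than fitting f(x) = a + k*x with k = 3e9, the scale factor can be moved into the function definition:
f(x) = a + 1e9*k*x
a = 1.0; k = 3.0
fit f(x) 'measured.dat' via a, k
The fitted k must then be multiplied by 1e9 to recover the original slope.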
If you can write your function as a linear combination of simple functions weighted by the parameters to be fitted, by all means do so. That helps a lot, because the problem is no longer nonlinear and should converge with only a small number of iterations, perhaps just one.
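To illustrate why a linear problem needs no iteration, here is a minimal Python sketch that solves a least-squares problem in a single step through the normal equations. This is not the algorithm `fit` uses. The helper names `solve_linear_system` and `linear_fit`, the basis functions sin(x) and x, and the synthetic data are all assumptions made for the example.

```python
import math

def solve_linear_system(matrix, rhs):
    """Solve matrix * c = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is untouched.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (aug[row][n] - tail) / aug[row][row]
    return coeffs

def linear_fit(xs, ys, basis):
    """Least-squares coefficients c_j for y ~ sum_j c_j * basis[j](x)."""
    design = [[g(x) for g in basis] for x in xs]
    m = len(basis)
    normal = [[sum(row[i] * row[j] for row in design) for j in range(m)]
              for i in range(m)]
    rhs = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(m)]
    return solve_linear_system(normal, rhs)

# Synthetic, noise-free data generated from y = 2*sin(x) + 0.5*x.
xs = [0.5 * i for i in range(10)]
ys = [2.0 * math.sin(x) + 0.5 * x for x in xs]
coeffs = linear_fit(xs, ys, [math.sin, lambda x: x])
print(coeffs)
```

The coefficients come out of one linear solve, with no starting values required, which is why such models are so much easier to fit.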
Some prescriptions for analysing data, given in practical experimentation courses, may have you first fit some functions to your data, perhaps in a multi-step process of accounting for several aspects of the underlying theory one by one, and then extract the information you really wanted from the fitting parameters of those functions. With `fit`, this may often be done in one step by writing the model function directly in terms of the desired parameters. Transforming data can also quite often be avoided, though sometimes at the cost of a more difficult fit problem. If you think this contradicts the previous paragraph about simplifying the fit function, you are correct.
A "singular matrix" message indicates that this implementation of the Marquardt-Levenberg algorithm can't calculate parameter values for the next iteration. Try different starting values, writing the function in another form, or a simpler function.
Finally, a nice quote from the manual of another fitting package (fudgit), that kind of summarizes all these issues: "Nonlinear fitting is an art!"
The section 2.6 help command displays on-line help. To specify information on a particular topic use the syntax:
help {<topic>}
If <topic> is not specified, a short message is printed about `gnuplot`. After help for the requested topic is given, a menu of subtopics is given; help for a subtopic may be requested by typing its name, extending the help request. After that subtopic has been printed, the request may be extended again or you may go back one level to the previous topic. Eventually, the `gnuplot` command line will return.
If a question mark (?) is given as the topic, the list of topics currently available is printed on the screen.
The section 2.7 if command allows commands to be executed conditionally.
Syntax:
if (<condition>) <command-line>
<condition> will be evaluated. If it is true (non-zero), then the command(s) of the <command-line> will be executed. If <condition> is false (zero), then the entire <command-line> is ignored. Note that use of `;` to allow multiple commands on the same line will _not_ end the conditionalized commands.
Examples:
pi=3
if (pi!=acos(-1)) print "?Fixing pi!"; pi=acos(-1); print pi
will display:
?Fixing pi!
3.14159265358979
but
if (1==2) print "Never see this"; print "Or this either"
will not display anything.
See section 2.15 reread for an example of how section 2.7 if and section 2.15 reread can be used together to perform a loop.
The section 2.8 load command executes each line of the specified input file as if it had been typed in interactively. Files created by the section 2.17 save command can later be loaded with section 2.8 load. Any text file containing valid commands can be created and then executed by the section 2.8 load command. Files being loaded may themselves contain section 2.8 load or section 2.2 call commands. See `comment` for information about comments in commands. To load a file with arguments, see section 2.2 call.
The section 2.8 load command _must_ be the last command on a multi-command line.
Syntax:
load "<input-file>"
The name of the input file must be enclosed in quotes.
The special filename "-" may be used to load commands from standard input. This allows a `gnuplot` command file to accept some commands from standard input. Please see "help batch/interactive" for more details.
Examples:
load 'work.gnu'
load "func.dat"
The section 2.8 load command is performed implicitly on any file names given as arguments to `gnuplot`. These are loaded in the order specified, and then `gnuplot` exits.
The section 2.9 pause command displays any text associated with the command and then waits a specified amount of time or until the carriage return is pressed. section 2.9 pause is especially useful in conjunction with section 2.8 load files.
Syntax:
pause <time> {"<string>"}
<time> may be any integer constant or expression. Choosing -1 will wait until a carriage return is hit, zero (0) won't pause at all, and a positive integer will wait the specified number of seconds. `pause 0` is synonymous with section 2.11 print.
Note: Since section 2.9 pause communicates with the operating system rather than the graphics, it may behave differently with different device drivers (depending upon how text and graphics are mixed).
Examples:
pause -1    # Wait until a carriage return is hit
pause 3     # Wait three seconds
pause -1  "Hit return to continue"
pause 10  "Isn't this pretty?  It's a cubic spline."
section 2.10 plot is the primary command for drawing plots with `gnuplot`. It creates plots of functions and data in many, many ways. section 2.10 plot is used to draw 2-d functions and data; `splot` draws 2-d projections of 3-d surfaces and data. section 2.10 plot and `splot` contain many common features; see `splot` for differences. Note specifically that `splot`'s section 2.20.1.1 binary and section 2.20.1.3 matrix options do not exist for section 2.10 plot.
Syntax:
plot {<ranges>}
     {<function> | {"<datafile>" {datafile-modifiers}}}
     {axes <axes>} {<title-spec>} {with <style>}
     {, {definitions,} <function> ...}
where either a <function> or the name of a data file enclosed in quotes is supplied. A function is a mathematical expression or a pair of mathematical expressions in parametric mode. The expressions may be defined completely or in part earlier in the stream of `gnuplot` commands (see `user-defined`).
It is also possible to define functions and parameters on the section 2.10 plot command itself. This is done merely by isolating them from other items with commas.
There are four possible sets of axes available; the keyword <axes> is used to select the axes for which a particular line should be scaled. `x1y1` refers to the axes on the bottom and left; `x2y2` to those on the top and right; `x1y2` to those on the bottom and right; and `x2y1` to those on the top and left. Ranges specified on the section 2.10 plot command apply only to the first set of axes (bottom left).
Examples:
plot sin(x)
plot f(x) = sin(x*a), a = .2, f(x), a = .4, f(x)
plot [t=1:10] [-pi:pi*2] tan(t), \
     "data.1" using (tan($2)):($3/$4) smooth csplines \
              axes x1y2 notitle with lines 5
Discrete data contained in a file can be displayed by specifying the name of the data file (enclosed in single or double quotes) on the section 2.10 plot command line.
Syntax:
plot '<file_name>' {index <index list>}
                   {every <every list>}
                   {thru <thru expression>}
                   {using <using list>}
                   {smooth <option>}
The modifiers section 2.10.1.3 index, section 2.10.1.1 every, section 2.10.1.6 thru, section 2.10.1.7 using, and section 2.10.1.4 smooth are discussed separately. In brief, section 2.10.1.3 index selects which data sets in a multi-data-set file are to be plotted, section 2.10.1.1 every specifies which points within a single data set are to be plotted, section 2.10.1.7 using determines how the columns within a single record are to be interpreted (section 2.10.1.6 thru is a special case of section 2.10.1.7 using), and section 2.10.1.4 smooth allows for simple interpolation and approximation. ('splot' has a similar syntax, but does not support the section 2.10.1.4 smooth and section 2.10.1.6 thru options.)
Data files should contain at least one data point per record (section 2.10.1.7 using can select one data point from the record). Records beginning with `#` (and also with `!` on VMS) will be treated as comments and ignored. Each data point represents an (x,y) pair. For plots with error bars (see section 2.10.2 errorbars), each data point is (x,y,ydelta), (x,y,ylow,yhigh), (x,y,xdelta), (x,y,xlow,xhigh), or (x,y,xlow,xhigh,ylow,yhigh). In all cases, the numbers on each record of a data file must be separated by white space (one or more blanks or tabs), unless a format specifier is provided by the section 2.10.1.7 using option. This white space divides each record into columns.
Data may be written in exponential format with the exponent preceded by the letter e, E, d, D, q, or Q.
Only one column (the y value) need be provided. If x is omitted, `gnuplot` provides integer values starting at 0.
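The record rules above can be summarized in a short Python sketch. It is an illustration only, not `gnuplot`'s reader: the function names `parse_number` and `parse_records` are hypothetical, only the plain (x,y) case is handled, and blank records are simply skipped here even though they are significant in a real data file.

```python
import re

def parse_number(token):
    """Convert a token to float, accepting d, D, q, or Q as the exponent letter."""
    return float(re.sub(r"[dDqQ]", "e", token))

def parse_records(lines):
    """Return (x, y) pairs from whitespace-separated records.

    Records beginning with '#' are comments. If a record has a single
    column, it is taken as y and x is supplied as 0, 1, 2, ...
    """
    points = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        columns = [parse_number(token) for token in line.split()]
        if len(columns) == 1:
            # x omitted: use the running point number, starting at 0.
            points.append((float(len(points)), columns[0]))
        else:
            points.append((columns[0], columns[1]))
    return points

print(parse_records(["# comment", "1 2", "3.0d1\t4E0"]))
print(parse_records(["5", "7"]))
```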
In datafiles, blank records (records with no characters other than blanks and a newline and/or carriage return) are significant--pairs of blank records separate section 2.10.1.3 indexes (see section 2.10.1.3 index). Data separated by double blank records are treated as if they were in separate data files.
Single blank records designate discontinuities in a section 2.10 plot; no line will join points separated by a blank record (if they are plotted with a line style).
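As a rough illustration of how blank records structure a file, the Python sketch below groups records into data sets (separated by pairs of blank records) and, within each data set, into blocks (separated by single blank records). The function name `split_datasets` is hypothetical. The sketch assumes comment records have already been removed and treats any run of two or more blank records as a data-set boundary.

```python
def split_datasets(lines):
    """Group records into data sets, each a list of blocks of records."""
    datasets = []
    blank_run = 0  # number of consecutive blank records just seen
    for line in lines:
        if not line.strip():
            blank_run += 1
            continue
        if not datasets or blank_run >= 2:
            datasets.append([[line]])      # start a new data set
        elif blank_run == 1:
            datasets[-1].append([line])    # start a new block
        else:
            datasets[-1][-1].append(line)  # continue the current block
        blank_run = 0
    return datasets

records = ["1 1", "2 2", "", "3 3", "", "", "4 4"]
print(split_datasets(records))
```

In this example the first data set holds two blocks, so a line style would draw a segment from (1,1) to (2,2) but none to (3,3); the record `4 4` forms a second data set.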
If autoscaling has been enabled (section 2.18.3 autoscale), the axes are automatically extended to include all datapoints, with a whole number of tic marks if tics are being drawn. This has two consequences: i) For `splot`, the corner of the surface may not coincide with the corner of the base. In this case, no vertical line is drawn. ii) When plotting data with the same x range on a dual-axis graph, the x coordinates may not coincide if the x2tics are not being drawn. This is because the x axis has been autoextended to a whole number of tics, but the x2 axis has not. The following example illustrates the problem:
reset; plot '-', '-'
1 1
19 19
e
1 1
19 19
e
The section 2.10.1.1 every keyword allows a periodic sampling of a data set to be plotted.
In the discussion a "point" is a datum defined by a single record in the file; "block" here will mean the same thing as "datablock" (see `glossary`).
Syntax:
plot 'file' every {<point_incr>}
                    {:{<block_incr>}
                      {:{<start_point>}
                        {:{<start_block>}
                          {:{<end_point>}
                            {:<end_block>}}}}}
The data points to be plotted are selected according to a loop from <`start_point`> to <`end_point`> with increment <`point_incr`> and the blocks according to a loop from <`start_block`> to <`end_block`> with increment <`block_incr`>.
The first datum in each block is numbered '0', as is the first block in the file.
Note that records containing unplottable information are counted.
Any of the numbers can be omitted; the increments default to unity, the start values to the first point or block, and the end values to the last point or block. If section 2.10.1.1 every is not specified, all points in all lines are plotted.
Examples:
every :::3::3    # selects just the fourth block ('0' is first)
every :::::9     # selects the first 10 blocks
every 2:2        # selects every other point in every other block
every ::5::15    # selects points 5 through 15 in each block
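The selection loop described above can be sketched in Python as follows. This models the documented behavior and is not `gnuplot`'s code; the function name `select_every` and the representation of a file as a list of blocks are assumptions. Omitted end values are represented by `None`, and end values beyond the data are clamped to the last point or block.

```python
def select_every(blocks, point_incr=1, block_incr=1,
                 start_point=0, start_block=0,
                 end_point=None, end_block=None):
    """Return the points chosen by 'every', keeping the block structure."""
    last_block = len(blocks) - 1
    if end_block is not None:
        last_block = min(end_block, last_block)
    selected = []
    for b in range(start_block, last_block + 1, block_incr):
        block = blocks[b]
        last_point = len(block) - 1
        if end_point is not None:
            last_point = min(end_point, last_point)
        selected.append([block[p] for p in
                         range(start_point, last_point + 1, point_incr)])
    return selected

# Twelve blocks of twenty points; each point is labeled (block, point).
blocks = [[(b, p) for p in range(20)] for b in range(12)]

fourth_block = select_every(blocks, start_block=3, end_block=3)      # every :::3::3
first_ten = select_every(blocks, end_block=9)                        # every :::::9
alternate = select_every(blocks, point_incr=2, block_incr=2)         # every 2:2
points_5_to_15 = select_every(blocks, start_point=5, end_point=15)   # every ::5::15
```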
See also the Simple Plot Demos, Non-parametric splot demos, and Parametric splot demos.
This example plots the data in the file "population.dat" and a theoretical curve:
pop(x) = 103*exp((1965-x)/10)
plot [1960:1990] 'population.dat', pop(x)
The file "population.dat" might contain:
# Gnu population in Antarctica since 1965
1965   103
1970   55
1975   34
1980   24
1985   10
The section 2.10.1.3 index keyword allows only some of the data sets in a multi-data-set file to be plotted.
Syntax:
plot 'file' index <m>{{:<n>}:<p>}
Data sets are separated by pairs of blank records. `index <m>` selects only set <m>; `index <m>:<n>` selects sets in the range <m> to <n>; and `index <m>:<n>:<p>` selects indices <m>, <m>+<p>, <m>+2<p>, etc., but stopping at <n>. Following C indexing, the index 0 is assigned to the first data set in the file. Specifying too large an index results in an error message. If section 2.10.1.3 index is not specified, all sets are plotted as a single data set.
Example:
plot 'file' index 4:5
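The three forms of the index specification can be modeled with a short Python sketch. It is illustrative only: the function name `select_index` is hypothetical, the data sets are stand-in labels, and raising `IndexError` is merely one way to mirror the error reported for an index that is too large.

```python
def select_index(datasets, m, n=None, p=1):
    """Return data sets m, m+p, m+2p, ... up to and including n (0-based)."""
    if n is None:
        n = m  # 'index <m>' selects only set m
    if max(m, n) >= len(datasets):
        raise IndexError("index too large for this file")
    return [datasets[i] for i in range(m, n + 1, p)]

datasets = ["set0", "set1", "set2", "set3", "set4", "set5", "set6"]
print(select_index(datasets, 4, 5))     # corresponds to: index 4:5
print(select_index(datasets, 0, 6, 3))  # corresponds to: index 0:6:3
```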
`gnuplot` includes a few general-purpose routines for interpolation and approximation of data; these are grouped under the section 2.10.1.4 smooth option. More sophisticated data processing may be performed by preprocessing the data externally or by using `fit` with an appropriate model.
Syntax:
smooth {unique | csplines | acsplines | bezier | sbezier}
`unique` plots the data after making them monotonic. Each of the other routines uses the data to determine the coefficients of a continuous curve between the endpoints of the data. This curve is then plotted in the same manner as a function, that is, by finding its value at uniform intervals along the abscissa (see section 2.18.45 samples) and connecting these points with straight line segments (if a line style is chosen).
If section 2.18.3 autoscale is in effect, the ranges will be computed such that the plotted curve lies within the borders of the graph.
If too few points are available to allow the selected option to be applied, an error message is produced. The minimum number is one for `unique`, four for `acsplines`, and three for the others.
The section 2.10.1.4 smooth options have no effect on function plots.
-- ACSPLINES ---
The `acsplines` option approximates the data with a "natural smoothing spline". After the data are made monotonic in x (see `smooth unique`), a curve is piecewise constructed from segments of cubic polynomials whose coefficients are found by weighting the data points; the weights are taken from the third column in the data file. That default can be modified by the third entry in the section 2.10.1.7 using list, e.g.,
plot 'data-file' using 1:2:(1.0) smooth acsplines
Qualitatively, the absolute magnitude of the weights determines the number of segments used to construct the curve. If the weights are large, the effect of each datum is large and the curve approaches that produced by connecting consecutive points with natural cubic splines. If the weights are small, the curve is composed of fewer segments and thus is smoother; the limiting case is the single segment produced by a weighted linear least squares fit to all the data. The smoothing weight can be expressed in terms of errors as a statistical weight for a point divided by a "smoothing factor" for the curve so that (standard) errors in the file can be used as smoothing weights.
Example:
sw(x,S)=1/(x*x*S)
plot 'data_file' using 1:2:(sw($3,100)) smooth acsplines
-- BEZIER ---
The `bezier` option approximates the data with a Bezier curve of degree n (the number of data points) that connects the endpoints.
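For readers unfamiliar with Bezier curves, the following Python sketch evaluates one with de Casteljau's algorithm, treating the data points as control points. This is a generic textbook construction, assumed here for illustration and not taken from `gnuplot`'s source; the function name `bezier_point` is hypothetical. It shows the property stated above: the curve connects the endpoints but in general only approaches the interior points.

```python
def bezier_point(control_points, t):
    """Evaluate the Bezier curve at parameter t in [0, 1] (de Casteljau)."""
    points = [(float(x), float(y)) for x, y in control_points]
    # Repeatedly interpolate between neighbors until one point remains.
    while len(points) > 1:
        points = [((1.0 - t) * x0 + t * x1, (1.0 - t) * y0 + t * y1)
                  for (x0, y0), (x1, y1) in zip(points, points[1:])]
    return points[0]

data = [(0, 0), (1, 2), (2, 0)]
start = bezier_point(data, 0.0)   # coincides with the first data point
middle = bezier_point(data, 0.5)  # lies below the interior point (1, 2)
end = bezier_point(data, 1.0)     # coincides with the last data point
print(start, middle, end)
```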
-- CSPLINES ---
The `csplines` option connects consecutive points by natural cubic splines after rendering the data monotonic (see `smooth unique`).
-- SBEZIER ---
The `sbezier` option first renders the data monotonic (`unique`) and then applies the `bezier` algorithm.
-- UNIQUE ---
The `unique` option makes the data monotonic in x; points with the same x-value are replaced by a single point having the average y-value. The resulting points are then connected by straight line segments. See demos.
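The averaging step can be pictured with a small Python sketch. It follows the description above rather than `gnuplot`'s implementation, and the function name `smooth_unique` is hypothetical.

```python
from collections import defaultdict

def smooth_unique(points):
    """Sort points by x and replace equal-x points by their average y."""
    groups = defaultdict(list)
    for x, y in points:
        groups[x].append(y)
    return [(x, sum(ys) / len(ys)) for x, ys in sorted(groups.items())]

# The two points at x = 2 collapse into one with y = (1 + 3) / 2.
print(smooth_unique([(2, 1.0), (1, 4.0), (2, 3.0)]))
```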
A special filename of `'-'` specifies that the data are inline; i.e., they follow the command. Only the data follow the command; section 2.10 plot options like filters, titles, and line styles remain on the 'plot' command line. This is similar to << in unix shell script, and $DECK in VMS DCL. The data are entered as though they are being read from a file, one data point per record. The letter "e" at the start of the first column terminates data entry. The section 2.10.1.7 using option can be applied to these data--using it to filter them through a function might make sense, but selecting columns probably doesn't!
`'-'` is intended for situations where it is useful to have data and commands together, e.g., when `gnuplot` is run as a sub-process of some front-end application. Some of the demos, for example, might use this feature. While section 2.10 plot options such as section 2.10.1.3 index and section 2.10.1.1 every are recognized, their use forces you to enter data that won't be used. For example, while
plot '-' index 0, '-' index 1
2
4
6


10
12
14
e
2
4
6


10
12
14
e
does indeed work,
plot '-', '-'
2
4
6
e
10
12
14
e