WO1992006425A1

WO1992006425A1 - Multi-dimensional graphing in two-dimensional space

Info

Publication number: WO1992006425A1
Application number: PCT/US1991/007095
Authority: WO
Inventors: Ted W. Mihalisin; John Timlin; Edward T. Gawlinski; John W. Schwegler
Original assignee: Temple University
Priority date: 1990-09-28
Filing date: 1991-09-27
Publication date: 1992-04-16
Also published as: AU8644691A; EP0550617A1; US5228119A; CA2092570A1; EP0550617A4; JPH06507261A; USRE36840E

Abstract

A system uses a computer to graph multivariate or multidimensional data or functions, wherein the independent variable values (10) form an n-dimensional lattice or can be mapped to an n-dimensional lattice via binning or interpolation, on the two-dimensional surface of a monitor or output device in two ways (13). The first type of multi-dimensional graph represents the data or function hierarchically using hierarchical cells onto which are plotted hierarchical symbols. The cells correspond to the zero-, one-, two-, three-, four-dimensional etc. subspaces of the independent variables while the symbols represent the behavior of the dependent variable over the corresponding subspaces. The second type of multi-dimensional graph displays all possible paths between contiguous points in the n-dimensional space and symbols whose positions, sizes and/or other attributes can represent the values of the dependent variables.

Description

MULTI-DIMENSIONAL GRAPHING

IN TWO-DIMENSIONAL SPACE

This application is a continuation-in-part of pending U.S. Patent Application Serial No. 07/608,337, filed November 2, 1990, which is a continuation-in-part of U.S. Patent Application Serial No. 07/589,820, filed September 28, 1990.

Field of the Invention

This invention relates to the graphing of data or mathematical functions which have two or more independent variables and one dependent variable.

Background of the Invention

Graphs have long served the purpose of allowing visual perception and interpretation of data sets and functions. Typically, graphing involves plotting in two dimensions along an X-axis and a Y-axis. This involves the plotting of a Y "dependent" variable against an X "independent" variable.

There are other systems and methods for visualizing 3-D data. Such techniques include color maps, contours, wire meshes, as well as numerous other surface rendering techniques. All too often, 3-D or multi-dimensional data sets are viewed in two dimensions in the form of X,Y plots, and then repeated over various combinations until all variables are completed. Another graphing technique involves the maintenance of variables as parameters in order to produce a two-dimensional X,Y plot.

Still another method of multi-dimensional graphing is referred to as a graph "matrix." This consists of plotting all points in the multi-dimensional space in terms of their projections onto all possible planes. This technique proves to be quite useful in analyzing randomly sampled data (as opposed to lattice or grid-like data), especially in statistical investigations in which a clear identification of the dependent and independent variables may not be possible. Since it is the projection of all data points onto the various planes that is shown, a variety of data "labeling" and "brushing" tools have been developed in order to identify corresponding points for each of the graphs.

These "matrix" graphs do not provide an easy and intuitive means of recognizing the mathematical form that one should use to fit multi-dimensional data. The primary reason for this shortcoming is that the matrix graph technique displays projections onto a particular two-dimensional subspace rather than all possible "parallel" planar slices through this space (corresponding to all possible values of the remaining variables).

Summary of the Invention

This invention uses a digital computer for graphing multi-dimensional data sets or functions on the two-dimensional space of a computer output or display device. The invention requires the values of the independent variables of the data sets or functions to form an n-dimensional lattice of points or that they can be mapped to an n-dimensional lattice via binning or interpolation. The collection of points in the n-dimensional space can be viewed as a collection of points, or as a collection of parallel lines of points, or as a collection of parallel planes of points, etc., until finally as a collection of parallel (n-1)-dimensional subspaces. These subspaces are nested hierarchically, with points being of the lowest dimension, namely zero, nested within lines of points of dimension one, which are nested within planes of points of dimension two, etc., until finally one has an (n-1)-dimensional subspace of points nested within the entire n-dimensional lattice.

One aspect of the invention uses a digital computer to partition its output device or a portion of its output device into a hierarchy of two-dimensional cells which may be arranged horizontally or vertically or in both the horizontal and vertical directions. Similarly, the computer uses one or more rules selected by the user from a library of appropriate rules that allow it to characterize the behavior of the dependent variable or variables over each subspace corresponding to a cell by one or more numbers which are then used to determine the size and/or shape of a one- or two-dimensional graphic symbol or symbols selected from the library of appropriate symbols.

A further embodiment of the invention is related to a different but complementary way of viewing the n-dimensional lattice of independent variables. In this view, the lattice is not represented as a collection of nested subspaces with corresponding hierarchical cells and symbols, but rather as a very large collection of possible equal-step paths between the two extremes, namely the point where all independent variables have their minimum values and the point where all independent variables have their maximum values. A computer displays on an output device a symbolic representation of each and every possible path between the extremes, as well as a graphical representation of the value of the dependent variable at each point along each path.

In addition, both embodiments of the invention contain a variety of tools which allow a computer to display a wide variety of subsets of cells and symbols or of paths and points. Finally, the invention can be used to toggle between its hierarchical subspace and non-hierarchical path aspects for the same data set or function.

Brief Description of the Figures

The file of this patent contains at least one drawing executed in color. Copies of this patent with color drawings will be provided by the Patent and Trademark Office upon request and payment of necessary fee.

Fig. 1 shows a flow chart of the method of the present invention.

Fig. 2 shows an illustration of an example application of the present invention.

Figs. 3A and 3B show illustrations of example applications of the present invention.

Fig. 4 shows the program structure of the present invention.

Fig. 5 shows the main event loop of the present invention.

Fig. 6 shows a flow chart of the Zoom In tool.

Fig. 7 shows a flow chart of the Zoom Out tool.

Fig. 8 shows a flow chart of the Animate tool.

Fig. 9 shows a flow chart of the Expander tool.

Fig. 10 shows a flow chart of the General Zoom tool.

Fig. 11 shows a flow chart of the Decimate tool.

Fig. 12 shows a flow chart of the Permute tool.

Fig. 13 shows a flow chart of the Cloning tool.

Fig. 14 shows an embodiment of the invention.

Fig. 15 shows an additional embodiment of the invention.

Fig. 16 shows an additional embodiment of the invention.

Fig. 17 shows an additional embodiment of the invention.

Fig. 18 shows an additional embodiment of the invention.

Fig. 19 shows an additional embodiment of the invention.

Fig. 20 shows an illustration of example applications of the present invention.

Fig. 21 shows an illustration of example applications of the present invention.

Fig. 22 shows an illustration of example applications of the present invention.

Fig. 23 shows an illustration of an example application of the present invention.

Fig. 24 shows an illustration of an example application of the present invention.

Fig. 25 shows an illustration of an example application of the present invention.

Fig. 26 shows an illustration of an example application of the present invention.

Fig. 27 shows a one hierarchical axis cell arrangement and a two hierarchical axis cell arrangement.

Fig. 28 shows examples of status indicators.

Fig. 29 shows examples of symbols.

Fig. 30 shows a flowchart of an embodiment of the present invention.

Fig. 31 shows flowcharts for the draw suppression tool, symbol color tool and manual scaling tool.

Fig. 32 shows flowcharts for the grid line tool, status indicator display tool, black and white/color tool, midline display tool, symbol outline tool and rendering direction tool.

Fig. 33 shows flowcharts for the global symbol tool, independent/dependent variable tool, hardcopy tool and interrogate tool.

Fig. 34 shows flowcharts for the cell/symbol suppression tool, draw attributes tool and symbol representation tool.

Fig. 35 shows flowcharts for the cell transformation tool and symbol transformation tool.

Figs. 36A and 36B show examples of the display tool.

Figs. 36C and 36D show examples of transformations of symbols.

Figs. 37A and 37B show how the display tool works in a two hierarchical axes case.

Fig. 37C shows a hierarchical lattice.

Fig. 38A shows four one-dimensional status indicators.

Fig. 38B shows two two-dimensional status indicators.

Fig. 38C shows a single four-dimensional status indicator.

Fig. 39 shows three possible ways of re-binning a variable.

Fig. 40 shows example status indicators.

Fig. 41A shows an example one hierarchical axis graph.

Fig. 41B shows an example two hierarchical axis graph.

Fig. 42A shows an example one hierarchical axis min/max graph.

Fig. 42B shows an example all paths display of Fig. 42A.

Fig. 43 shows another example all paths display.

Fig. 44 shows another example all paths display.

Fig. 45 shows an example schematic representation of an overview array tool display.

Fig. 46 shows an example of an array when two variables are constrained to bin sets.

Fig. 47 shows an example display of an array formed of a collection of graphs.

Detailed Description of the Invention

The present invention pertains to a method for plotting scalar fields on an N-dimensional lattice. It is useful, among other things, for a variety of data visualization tasks such as the location of maxima, minima, saddle points and other features. It is also useful for visually fitting multi-variate data and for making the visual determination of dominant and weak or irrelevant variables.

In one embodiment of the invention, each independent variable is sampled in a regular grid or lattice-like fashion (spaced in equal increments). The number and spacing of values may differ for each variable; however, in this embodiment no missing values are allowed. Thus, the N independent variable values form a hyper-rectangular lattice in the N-dimensional space within a hyper-rectangular parallelepiped domain.

Since the definition of a function is a locus of points, the present invention pertains equally to plotting functions or plotting data values.

In Fig. 1, there is shown a flowchart of the method of the present invention. In block 10, the independent variable values are read into the computer. There are numerous ways in which data values can be entered into a computer, such as through a data file or a real time solution of an equation. Next, in block 11, the independent variables are ranked. This invention plots multi-dimensional variables in two-dimensional space based on a hierarchical ranking of variables, and the resulting rectangles which are plotted thereupon. In order to do so, it is necessary for the operator or the computer system, if it is configured as such, to rank the variables from fastest to slowest-running variable. This can be a completely arbitrary ranking, and in fact it is often useful to view the multi-dimensional graphs in different combinations of rankings of the independent variables. Regardless, it is necessary to set up a ranking from fastest to slowest by whatever designation is desired by the user.

Next, in block 12, the corresponding dependent variable values are plotted in two-dimensional space. With the independent variables ranked in their hierarchical fashion and plotted against the X-axis, the corresponding dependent variable value is plotted along the Y-axis. This gives a distribution of values in two-dimensional space. In some embodiments of the invention, the dependent variable values are already computed or known and read into a data file similar to the independent variable values. In other embodiments, the dependent variables are calculated based on the independent variable values. In either case, the dependent variable values are plotted along the Y-axis. It should be noted that, similar to the flexibility in the ranking of the independent variables, it is possible to change the designation of variables from independent to dependent, and vice versa. Again, this produces different visual results which may be more useful in interpreting the data sets.

Next, in block 13, the hierarchical rectangles are drawn. The method and system for multi-dimensional graphing in two-dimensional space works by displaying hierarchical rectangles in different colors. The fastest-running variables are displayed in the present embodiment as "hash" marks. These hash marks can be thought of as rectangles having zero height. The next fastest variable then becomes a rectangle encompassing the fastest-running variable values throughout the range of the next fastest-running variable. This iteration of next fastest-running variables continues until the slowest-running variable is accounted for. This results in a nesting of rectangles as shown in Fig. 3.

Fig. 3A is a graphical representation, using the present invention, of the Ideal Gas Law. The Ideal Gas Law is described in the form P=nRT/V, where P is pressure, n is the number of moles, R is the gas constant, T is the temperature (in kelvin), and V is the volume occupied by the gas. The ranking of the independent variables T, n and V is shown in Fig. 2 in their corresponding positions on the X-axis. In this particular case, T is designated as the fastest-running variable and is illustrated along the number line as the smallest set of hash marks; n is the next fastest-running variable, and this is shown as the next largest set of hash marks. Finally, V is shown as the slowest-running variable, and this is illustrated as the largest set of hash marks.

It can be seen in Fig. 2 that the fastest-running variables are nested within each next fastest-running variable and then repeated for the next value of the next fastest-running variable. This translates to T values of 1, 2, 3 and 4, while n=0 and V=0. Then, T runs through 1, 2, 3 and 4, while n=1 and V=0, etc., until completed for all four values of n (1-4). This cycle is then repeated for all values of V from 1-4.
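The nesting order described above can be sketched in a few lines of Python. This is only an illustration, not the patented program: the function names `hierarchical_order` and `x_position` are hypothetical, and the sketch assumes each of T, n and V takes the values 1 to 4.

```python
from itertools import product


def hierarchical_order(values_fast_to_slow):
    """Yield lattice points in plotting order.

    `values_fast_to_slow` lists each variable's values, fastest-running first.
    Each yielded tuple keeps that same fastest-first ordering.
    """
    # itertools.product varies its LAST argument fastest, so feed it the
    # variables slowest-first and reverse each tuple on the way out.
    for point in product(*reversed(values_fast_to_slow)):
        yield tuple(reversed(point))


def x_position(indices_fast_to_slow, counts_fast_to_slow):
    """Horizontal slot (0-based) of a lattice point given 0-based indices."""
    position, stride = 0, 1
    for index, count in zip(indices_fast_to_slow, counts_fast_to_slow):
        position += index * stride
        stride *= count          # each slower variable spans all faster ones
    return position


# Assumed sample values: four values for each of T (fastest), n, V (slowest).
T = n = V = [1, 2, 3, 4]
order = list(hierarchical_order([T, n, V]))
print(order[:5])  # (T, n, V) tuples: T cycles first, then n advances
```

The first four tuples differ only in T; the fifth resets T and advances n, which is the cycle the text describes.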

The result of this is shown in Fig. 3A. In Fig. 3A, three "colors" are used. Color selections are made by the operator. The system provides data visualization using color information. References to the figures will be to the "colors" corresponding to the reference numbers. The use of color aids in visualizing the data. White hash marks 14 designate the fastest-running variable T, while blue rectangles 15 represent the next fastest-running variable n and, finally, orange rectangles 16 represent the slowest-running variable V. When looking at the graph, the value P is plotted along the Y-axis while the independent variable values are plotted along the X-axis.

The white hash marks 14 designate the T value. The white hash marks are connected by splines 17, whose purpose is to aid in the visual interpretation. They are not a requirement of the present method and system, but instead a useful interpretative tool. The splines are used in connecting groups of rectangles, i.e., fastest, next fastest and slowest-running variables. The blue rectangles 15 represent the value of n. Each blue rectangle encompasses four of the white hash marks (fastest-running variables), as there are four values of T for each value of n. The blue rectangles are also connected by splines.

Finally, the orange rectangles 16, which represent the variable values V, encompass four blue rectangles. This is a result of there being four values of n for each value of V. Thus, a nesting of rectangles in a hierarchical fashion illustrates the graph of the Ideal Gas Law. One can view the different groupings of variables at any point to interpret the data set, while at the same time seeing conditions on either side of that location for the entire set of variable values.
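As a rough illustration of the nesting just described, the following sketch computes the blue and orange rectangles for the Ideal Gas Law example. It is not the patented implementation: the value of R, the sample values 1 to 4, and the dictionary layout are assumptions made for this example.

```python
R = 8.314  # gas constant in J/(mol*K); the units are an assumption of this sketch


def pressure(T, n, V):
    """Ideal Gas Law in the form P = nRT/V."""
    return n * R * T / V


# Assumed sample values for each independent variable.
T_vals = n_vals = V_vals = [1, 2, 3, 4]


def rectangles():
    """Return (orange, blue) rectangles as dicts with x, width, bottom, top."""
    orange, blue = [], []
    for iv, V in enumerate(V_vals):
        v_low, v_high = float("inf"), float("-inf")
        for i_n, n in enumerate(n_vals):
            # The white hash marks: one pressure value per T.
            p = [pressure(T, n, V) for T in T_vals]
            x = (iv * len(n_vals) + i_n) * len(T_vals)
            blue.append({"x": x, "width": len(T_vals),
                         "bottom": min(p), "top": max(p)})
            v_low, v_high = min(v_low, min(p)), max(v_high, max(p))
        # One orange rectangle encloses all blue rectangles for this V.
        orange.append({"x": iv * len(n_vals) * len(T_vals),
                       "width": len(n_vals) * len(T_vals),
                       "bottom": v_low, "top": v_high})
    return orange, blue


orange, blue = rectangles()
print(len(orange), len(blue))  # → 4 16
```

Each blue rectangle spans four hash marks and each orange rectangle spans four blue rectangles, matching the counts given in the text.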

There is shown in Fig. 3B a graph representing the Gaussian function $w = e^{-(x^2 + y^2 + z^2)}$ using the invention. Fig. 3B uses the same color representations as Fig. 3A. Accordingly, the reference numbers correspond to the same "colors" as described above.

There is shown in Fig. 4 an overview of the program structure of an embodiment of the present invention. This system chart of the program structure shows the available tools all connected to a main event loop 27. Main event loop 27 is shown in Fig. 5. All tools and operational commands are initiated via subroutine calls in the present embodiment of the invention. The tools are shown in Figs. 6-13.

In Fig. 5, there is shown a flow chart of the basic operation of main event loop 27. The main event loop is a central point of program flow. After the program embodying the present invention is initialized, it enters the main event loop and all subsequent actions are dispatched from here. An event is usually some user input requesting some action of the program. Once the main event loop is entered, it continually scans for an event (i.e., tool). When an event is detected (received), the main event loop determines what action should be taken and issues the appropriate function calls (i.e., subroutine calls). After the function (tool) completes its execution, the main event loop resumes scanning for user input. In block 42, the main event loop waits for user action. Then, after user action such as the toggling of buttons on a mouse or pressing of keys on a keyboard, an appropriate function call is made in block 43. The function (or subroutine) is called and completed and then processing returns to the main event loop in block 44. The main event loop then keeps cycling, waiting for user input before making the appropriate function calls.
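The dispatch pattern of blocks 42 to 44 might be sketched as follows. This is a simplified stand-in rather than the actual program: a pre-filled queue replaces real mouse and keyboard input, and the handler names and the "quit" event are hypothetical.

```python
from collections import deque


def run_event_loop(events, handlers):
    """Dispatch queued events to tool handlers until the queue is empty."""
    log = []
    queue = deque(events)
    while queue:                       # block 42: wait for user action
        name = queue.popleft()
        if name == "quit":             # hypothetical exit event
            break
        handler = handlers.get(name)
        if handler is not None:
            log.append(handler())      # block 43: make the function call
        # block 44: the tool has finished; resume scanning
    return log


# Hypothetical tools, each returning a short status string.
handlers = {"zoom_in": lambda: "zoomed in", "zoom_out": lambda: "zoomed out"}
result = run_event_loop(["zoom_in", "unknown", "zoom_out", "quit", "zoom_in"], handlers)
print(result)  # → ['zoomed in', 'zoomed out']
```

Unrecognized events are ignored and the loop simply continues scanning, which mirrors the cycling behavior described above.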

There is shown in Fig. 6 a flow chart of the Zoom In tool (20 in unscaled mode, 21 in scaled mode). The Zoom In tool reduces the dimensionality of the plotted space. One of the currently displayed second slowest-running variables is selected from the currently displayed slowest-running variable, and this selected second slowest-running variable becomes the currently displayed slowest-running variable. The net effect of this tool is to zoom in on one of the spaces or subspaces. This tool can be used for finding maxima and minima.

In the scaled version, the zoomed-in subspace is proportioned to the size of the display screen. In the unscaled Zoom In, the selected subspace is kept in its original proportion as in the space from which it was selected. Unscaled tools allow the user to see tendencies, such as decay and growth.

The Zoom In tool operates by grabbing the position of the variable space to be zoomed in, as shown in block 45. This can be accomplished by pointing to the variable space (rectangle) using a mouse or other pointing device. Next, the plotted space (displayed space) is set to the subspace selected in block 45. This is shown in block 46. In block 47, the display space is repainted in either scaled or unscaled version showing the zoomed-in subspace. In block 48, control returns to the main event loop to continue scanning for new events.

There is shown in Fig. 7 a flow chart of the Zoom Out tool. The Zoom Out tool corresponds to block 22 (in unscaled form) and block 23 (in scaled form) of Fig. 4. The Zoom Out tool works inversely to the Zoom In tool and, as such, increases the dimensionality of the plot. Note that the dimensionality cannot be increased above the maximum starting value. The subspace which runs slower than the currently displayed slowest-running variable becomes the currently displayed slowest-running variable. This, again, is up to the maximum starting value. An index is kept when zooming into subspaces so that the control system of the present invention can monitor the level of display of the current displayed space. In block 49, the previous (zoomed-in) subspace index is retrieved. In block 50, the plotted space (displayed space) is set to this subspace index value. In block 51, the plotted space is repainted to the subspace corresponding to this index value. Then, in block 52, control returns to the main event loop and scans for new events. The Zoom Out tool can be invoked in numerous ways; in the present embodiment it is operated by clicking on one of the mouse buttons. It could just as easily be configured to work via keyboard commands.

There is shown in Fig. 8 a flow chart for the Animate tool. The Animate tool sequentially displays each subspace in the currently displayed slowest-running variable. In the present embodiment, the sequential display cycles continually over the subspaces until the user terminates the animation. It is possible in other embodiments to set the cycling to a designated number. It is also possible to have a manually operated cycling operated by a pointing device such as a mouse, or through keyboard commands. The Animate tool can be operated in unscaled mode 24 or scaled mode 26. As with the other tools, scaled mode proportionally adjusts the currently displayed subspace to fill the display screen, while unscaled mode maintains the sizing of the designated subspace without adjustment. The Animate tool operates by first retrieving the subspace index in block 53. At decision block 54, it is determined whether to continue with the animation process. Should the user desire to continue, processing moves along to block 55, where the plotted space is set to the subspace index. In block 56, the plotted space is repainted according to the display for the subspace index. This will be in either scaled or unscaled mode, depending on the user's selection. In block 57, the next subspace index is obtained. Processing then returns to decision block 54, and the user determines whether or not to continue with the animation. Should the decision be "NO," processing continues to block 58 where the plotted space is set to the last subspace index. The screen is then repainted in block 59, and processing returns to the main event loop in block 60.
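The index bookkeeping behind the Zoom In, Zoom Out and Animate tools can be illustrated with a small state object. This is a sketch under stated assumptions, not the patent's code: the class name `ZoomState`, the nested-list data layout (slowest variable outermost) and the fixed cycle count in `animate` are all hypothetical, and repainting is omitted.

```python
class ZoomState:
    """Track which subspace of a nested (slowest-variable-outermost) list is shown."""

    def __init__(self, data):
        self.data = data
        self.path = []          # indices chosen by successive Zoom In steps

    def current(self):
        """Return the currently displayed subspace."""
        view = self.data
        for index in self.path:
            view = view[index]
        return view

    def zoom_in(self, index):
        """Select one subspace of the current slowest-running variable."""
        view = self.current()
        # Guard (an assumption of this sketch): the innermost level holds
        # dependent-variable values, so there is nothing further to enter.
        if not isinstance(view[index], list):
            raise ValueError("already at the lowest-dimensional subspace")
        self.path.append(index)
        return self.current()

    def zoom_out(self):
        """Return to the enclosing subspace, never beyond the starting space."""
        if self.path:
            self.path.pop()
        return self.current()

    def animate(self, cycles=1):
        """Yield each subspace of the current slowest variable in turn."""
        view = self.current()
        for _ in range(cycles):
            for sub in view:
                yield sub


# Small 2 x 2 x 2 lattice; each value encodes its own (V, n, T) indices.
data = [[[100 * v + 10 * n + t for t in range(2)] for n in range(2)] for v in range(2)]
state = ZoomState(data)
print(state.zoom_in(1))   # → [[100, 101], [110, 111]]
print(state.zoom_out() is data)  # → True
```

Keeping the chosen indices in a list plays the role of the index that the text says is kept when zooming into subspaces.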

There is shown in Fig. 9 a flow chart for the Expander tool. The Expander tool is applied about a particular point in the multi-dimensional space, and displays the variation along each independent variable using a homogeneous horizontal increment for each variable rather than the hierarchical increment which is the basis for the method and system of multi-dimensional graphing in two-dimensional space. The Expander tool takes a section of each variable through the point expanded upon, but does not sample all points in the display space. This tool is useful for tasks such as finding minima and maxima.

The Expander tool allows one to view how the dependent variable changes as one moves away from the point in question in the white independent variable direction until one reaches the edges of the data domain, and similarly for the blue, red, etc., independent variables. This tool can clearly be generalized by showing variations as one moves away from the point of expansion in more complex ways that involve non-parallel moves.

For example, one could show the variations that occur when, in addition to the standard Expander tool moves, one also displays moves about each point that correspond to incrementing all of the other colored variables by ± one. Further generalizations can involve all possible moves about the new points until, in fact, one could show all possible paths through the N-dimensional space.
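A minimal sketch of the basic (axis-parallel) Expander idea follows: one section per independent variable through a chosen point, with the other variables held fixed. The function name `expand`, the dictionary-based interface and the sample lattice are assumptions for illustration; the Gaussian is the function mentioned for Fig. 3B.

```python
import math


def expand(func, axes, point):
    """Return, for each variable, the dependent values along that variable's
    direction through `point`, with all other variables held fixed.

    `axes` maps variable names to their lattice values; `point` maps names to
    the chosen value of each variable.
    """
    sections = {}
    for name, values in axes.items():
        line = []
        for value in values:
            coords = dict(point)
            coords[name] = value       # move only along this one variable
            line.append(func(**coords))
        sections[name] = line
    return sections


def gaussian(x, y, z):
    return math.exp(-(x**2 + y**2 + z**2))


# Assumed 3 x 3 x 3 lattice centered on the origin.
axes = {"x": [-1, 0, 1], "y": [-1, 0, 1], "z": [-1, 0, 1]}
sections = expand(gaussian, axes, {"x": 0, "y": 0, "z": 0})
print(sections["x"])
```

Only 3 + 3 + 3 = 9 values are evaluated here rather than all 27 lattice points, which reflects the remark that the Expander tool does not sample all points in the display space.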

In block 61, the position or point is grabbed. In block 62, a new window is created for displaying the results of the expansion. In block 63, the lines representing the expansion through the point are painted. The painting of the lines is completed in the colors representing the corresponding independent variables. In block 64, processing returns to the main event loop.

There is shown in Fig. 10 a flow chart for the operation of the General Zoom tool. This General Zoom tool sets the limits, left and right, of the currently displayed slowest-running variable. The General Zoom tool does not change the currently displayed slowest-running variable. This tool is useful for showing portions of the currently displayed subspace. The General Zoom tool can be used in scaled (block 37) or unscaled (block 38) mode. The scaling and unscaling is exactly the same as has been described for the previous tools. This, as always, is a user designation. In block 65, the position on the X-axis is obtained for the left and right boundaries. In block 66, these boundaries are set as left and right limits. In block 67, the subspace is set to the left and right limits which were set in block 66. In block 68, the displayed subspace is repainted with the new left and right boundaries. In block 69, processing returns to the main event loop.

In the present embodiment of the invention, the General Zoom tool applies for all subspaces that the user now goes into and out of. This is a design choice, and is not a limitation of this tool in the present invention. Also, the General Zoom can be reset to the original limits in block 33 of Fig. 4.

There is shown in Fig. 11 a flow chart for the Decimate/Undecimate tools 32, 34, 35 and 36. As with the other tools, the Decimate and Undecimate tools operate in a scaled or unscaled mode.

The Decimate tool decreases the number of currently displayed slowest-running variable subspaces by only plotting every Nth subspace, where N is the level of decimation. The Undecimate tool operates in the opposite manner, but is limited to undecimating only decimated subspaces. Without the Decimator tool, an obvious drawback to this embodiment of the invention is that each data point uses at least one horizontal pixel. Since work station monitors generally have about 10³ pixels horizontally, this obviously limits the number of total data points displayed at any one time to 10³. This is despite the fact that multi-dimensional problems tend to require large numbers of data points.

The Decimator tool allows a fraction of the total distinct values for each variable to be displayed. In many cases, this still allows useful interpretation of the data and functions. For example, in a data set that has 10⁶ data points, one can show only the first, fourth, seventh and tenth values for each variable, hence, reducing the total number of points that need to be displayed to 4⁶, or 4096. This makes it necessary to scroll only four frames, instead of one thousand, to see the "entire" data set.

In order to get a detailed look at a particular subspace, the Zoom tool can be used. It is also possible to decimate certain variables in certain increments, and other variables in other increments. In another possible embodiment of this invention, a combination Zoom and Decimator tool is used for handling large data sets.

In block 70, the decimate level is set. This can be done through clicking the buttons on a pointing device, such as a mouse (in the present embodiment), or through keyboard input. In block 71, the subspace is repainted incorporating the decimation or undecimation level. Finally, in block 72, processing control returns to the main event loop.
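The arithmetic in the decimation example above (10⁶ points reduced to 4⁶ = 4096) can be checked with a short sketch. The helper name `decimate` is hypothetical, and the sketch assumes six variables with ten values each, which is what the figures 10⁶ and 4⁶ imply.

```python
def decimate(values, step):
    """Keep every `step`-th value, starting with the first."""
    return values[::step]


values = list(range(1, 11))   # ten values per variable (assumed)
kept = decimate(values, 3)    # first, fourth, seventh and tenth values
n_vars = 6                    # six independent variables (assumed)

full_points = len(values) ** n_vars
shown_points = len(kept) ** n_vars
print(kept, full_points, shown_points)  # → [1, 4, 7, 10] 1000000 4096
```

Applying a different `step` to each variable would correspond to decimating certain variables in certain increments and others in other increments.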

There is shown in Fig. 12 a flow chart for the Permute tool. The Permute tool changes the hierarchical assignment of the independent variables. The starting assignment is used as a reference for all future assignments. The functional dependence remains unchanged after using the Permute tool. It is only the order in which the data are plotted which is changed. In short, the Permute tool allows for the exchange of the rankings of the independent variables. This is very useful for determining which ranking gives the most useful or most beneficial visual results.

A related tool is the Array Plot tool 31 of Fig. 4. The Array Plot tool can show all or some combinations of rankings of independent variables in the display space. This allows the user to select which ranking gives the best or desired visual results.

In the present embodiment of the invention, the Permute tool works between pairs of variables. This pairwise exchange has been found to be a very practical way of using the Permute tool, but is not a limitation of the present invention.

In block 73, the subspaces are set to permute. In block 74, the data is rearranged according to the new ranking of the independent variables. In block 75, the displayed space is repainted according to the permutation. Finally, in block 76, processing control returns to the main event loop.
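The pairwise exchange performed by the Permute tool can be illustrated as follows. This is a sketch only: `plotted_sequence` and `permute` are hypothetical names, the two-value axes are assumed sample data, and the Ideal Gas Law is reused as the dependent variable.

```python
from itertools import product


def plotted_sequence(func, axes, ranking):
    """Dependent values in plotting order; `ranking` lists names fastest-first."""
    slow_first = list(reversed(ranking))
    out = []
    # itertools.product varies its last argument fastest, so the slowest
    # variable goes first.
    for combo in product(*(axes[name] for name in slow_first)):
        out.append(func(**dict(zip(slow_first, combo))))
    return out


def permute(ranking, a, b):
    """Return a new ranking with variables `a` and `b` exchanged."""
    i, j = ranking.index(a), ranking.index(b)
    new = list(ranking)
    new[i], new[j] = new[j], new[i]
    return new


def pressure(T, n, V, R=8.314):
    return n * R * T / V


axes = {"T": [1, 2], "n": [1, 2], "V": [1, 2]}   # assumed sample values
ranking = ["T", "n", "V"]
before = plotted_sequence(pressure, axes, ranking)
after = plotted_sequence(pressure, axes, permute(ranking, "T", "V"))
print(sorted(before) == sorted(after), before == after)  # → True False
```

The same set of values appears before and after the exchange; only the plotting order differs, consistent with the statement that the functional dependence is unchanged.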

There is shown in Fig. 13 a flow chart for the Cloning tool 30. The Cloning tool simply makes a copy of the currently displayed plot and places it in a window in another part of the screen. This allows the concurrent display of various subspaces. These displayed subspaces can be operated on by various tools to show an overall picture for the user. It can also be used to show different "zooms" at the same time for the user. In block 77, the current plot is painted into the cloning space. In block 78, the clone spot is repainted onto the screen. In block 79, processing control returns to the main event loop.

There are other tools in the program structure of Fig. 4, such as a Resize Tool 40 and Resize Panel 39. The Resize Tool is used for changing the size of the display space. The Resize Panel is used for changing the size of the display panel which monitors the operation of the display space and the various tools operating on it at any given time.

The Splines tool 29 draws lines between rectangles to be used as a guide for the eye. The splines that are drawn between rectangles are drawn according to the following criteria:

$$Y = \begin{cases} y_{\min}, & |y_{\min}| \ge |y_{\max}| \\ y_{\max}, & |y_{\min}| < |y_{\max}| \end{cases}$$

where $y_{\min}$ and $y_{\max}$ are the minima and maxima of the rectangle through which the spline is drawn. The splines are drawn hierarchically, joining rectangles of the same subspace together.
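The spline criterion above, which picks whichever of the rectangle's minimum or maximum has the larger magnitude (with ties going to the minimum), reduces to a one-line function. The name `spline_anchor` and the sample rectangles are hypothetical.

```python
def spline_anchor(ymin, ymax):
    """Y value through which the spline passes for one rectangle."""
    return ymin if abs(ymin) >= abs(ymax) else ymax


# (ymin, ymax) pairs for a few illustrative rectangles.
rects = [(1.0, 4.0), (-5.0, 2.0), (-3.0, 3.0)]
anchors = [spline_anchor(lo, hi) for lo, hi in rects]
print(anchors)  # → [4.0, -5.0, -3.0]
```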

The nested hierarchical rectangles of the present invention correspond to the behavior of the dependent variable W over independent variable subspaces of various dimensionalities. The following formulas are for the vertical and horizontal locations and extents of these rectangles in world coordinates (not screen coordinates). The corresponding screen coordinates would be measured from the lower left corner of the X window and would, in general, be offset in the scale for each independent variable to reflect the fact that the starting values of each variable may not be zero, and the increment value may vary from one independent variable to the next.

It is useful to denote the independent variables as X₁, X₂ ... X_n (instead of X_white, X_blue, where the colors pertain to the rectangle colors). Here, X₁ is the fastest-running variable, X₂ is the second fastest-running variable, and so on. Associated with each value of X₁ is an independent variable subspace of dimension d=0, i.e., a point. Associated with each value of X₂ is an independent variable subspace of dimension d=1, i.e., a line (of points). In general, each value of X_L corresponds to an independent variable subspace of dimension d=L-1, and has a corresponding rectangle (which may be thought of as corresponding to a subspace of dimension L, i.e., L-1 independent variables along the horizontal and one dependent variable, namely W, along the vertical).

Each independent variable X_L takes on values

$$X_{L,i} = X_{LS} + (i-1)\,\Delta X_L, \quad i = 1 \text{ to } N_L$$

In general, the starting values X_LS may differ, as may the increments ΔX_L and the total number of values N_L. In the formulas given below, we will set X_LS=0 and ΔX_L=1 which, in fact, corresponds more closely to the actual screen displayed rectangles and is essential in order to obtain correct formulas for the locations and extents of the rectangles.

Formulas

A) The Number of Rectangles of Each Type

The total number of X₁ rectangles is

$$N_{rect1} = \prod_{k=1}^{n} N_k$$

which corresponds to the number of points (d=0) in the independent variable space.

The total number of X₂ rectangles is

$$N_{rect2} = \prod_{k=2}^{n} N_k$$

which corresponds to the number of lines (d=1) along the X₁ direction.

The total number of X₃ rectangles is

$$N_{rect3} = \prod_{k=3}^{n} N_k$$

which corresponds to the number of planes (d=2), i.e., (X₁, X₂) planes.

In general,

$$N_{rectL} = \prod_{k=L}^{n} N_k$$

and corresponds to the number of subspaces of dimension d=L-1, i.e., (X₁, X₂ ... X_L-1) subspaces.
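As a rough illustration of the counting in A), the following Python sketch computes the number of rectangles at every level from the list of N_k values. The function name and the list layout (fastest-running variable first) are assumptions made for this example only.

```python
from math import prod


def rectangle_counts(n_values):
    """Return [N_rect1, N_rect2, ..., N_rectn].

    n_values[k-1] holds N_k, the number of values of X_k, with X_1 the
    fastest-running variable. N_rectL is the product of N_L ... N_n.
    """
    return [prod(n_values[level:]) for level in range(len(n_values))]
```

With N₁ = 3, N₂ = 3 and N₃ = 4 this gives 36 points, 12 lines along X₁ and 4 (X₁, X₂) planes.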

B) The Vertical Extent of the Rectangles

The vertical extent of a rectangle corresponding to a particular value of X_L, say, the i^th value (hence, corresponding to an independent variable subspace of dimension d=L-1), is given by the difference between the maximum value of the dependent variable in that subspace, W_L,i,max, and the minimum value, W_L,i,min, i.e.,

$$\Delta V_{L,i} = W_{L,i,\max} - W_{L,i,\min}$$

C) The Horizontal Extent of the Rectangles

The horizontal extent of a rectangle equals the sum of the corresponding horizontal extents of smaller rectangles within it (which correspond to lower dimensionality):

$$\Delta h_{L,i} = \Delta h_L = N_{L-1}\,\Delta h_{L-1} = N_{L-1} N_{L-2}\,\Delta h_{L-2} = N_{L-1} N_{L-2} \cdots N_1\,\Delta h_1 = N_{L-1} N_{L-2} \cdots N_1$$

since Δh₁ = 1. It is useful to define N_0 = 1 and to re-write Δh_L as

$$\Delta h_L = \prod_{i=0}^{L-1} N_i$$
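The recursion for the horizontal extents can be sketched in a few lines of Python. This is only an illustration under the stated convention that the smallest rectangle has unit width; the function name is hypothetical.

```python
def horizontal_extents(n_values):
    """Return the widths of the X_1 ... X_n rectangles.

    n_values[k-1] holds N_k (fastest-running variable first). The width of
    an X_1 rectangle is 1, and each level is N_(L-1) times wider than the
    level below it.
    """
    widths = []
    width = 1  # width of the smallest (X_1) rectangle
    for n_k in n_values:
        widths.append(width)
        width *= n_k  # the next level spans n_k rectangles of this level
    return widths
```

For N₁ = 3, N₂ = 3 and N₃ = 4 the widths are 1, 3 and 9, so the whole plot is 9 × 4 = 36 units wide.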

D) The Vertical Location of the Rectangles

The bottom of the rectangle corresponding to a particular value of X_L is given by

$$V_{bottom\,L,i} = W_{L,i,\min}$$

The top of this rectangle is given by

$$V_{top\,L,i} = W_{L,i,\max}$$

E) The Horizontal Locations of the Rectangles

The left edge of a rectangle is located at

$$h_{L,left} = \sum_{k=L}^{n} (j_k - 1) \prod_{i=0}^{k-1} N_i$$

Here, the set of integers j_L, j_L+1 ... j_n specify which X_L rectangle (i.e., which subspace of dimension d=L-1) one is referring to. Since h_L,left depends on the set {j_k} with k ≥ L, one could explicitly write h_L,left({j_k}, k ≥ L).

This expression can be made obvious if one uses the result from C above, namely

$$\Delta h_L = \prod_{i=0}^{L-1} N_i \quad \text{or} \quad \Delta h_k = \prod_{i=0}^{k-1} N_i$$

Hence,

$$h_{L,left} = \sum_{k=L}^{n} (j_k - 1)\,\Delta h_k = (j_n - 1)\,\Delta h_n + (j_{n-1} - 1)\,\Delta h_{n-1} + \cdots + (j_L - 1)\,\Delta h_L$$

that is, the sum of moving to the right by (j_n - 1) largest rectangles of width Δh_n, plus (j_n-1 - 1) next largest, etc., plus finally (j_L - 1) rectangles of width Δh_L. Again, the set of integers j_n, j_n-1 ... j_L specify which X_L rectangle (i.e., which subspace of independent variable dimension d=L-1) one is referring to.

The right side of the rectangle and its center are given by

$$h_{L,right} = h_{L,left} + \Delta h_L$$

$$h_{L,center} = h_{L,left} + \tfrac{1}{2}\,\Delta h_L$$
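The horizontal location formulas in E) might be sketched as follows. The function name, argument order and 1-based index convention are assumptions chosen for this example; the sketch works in world coordinates with unit-width X₁ rectangles.

```python
def rectangle_edges(level, j, n_values):
    """Return (left, right, center) of one X_L rectangle in world coordinates.

    level    -- L, counted from 1 (1 = fastest-running variable)
    j        -- [j_1, ..., j_n], 1-based value indices; only j_L..j_n are used
    n_values -- [N_1, ..., N_n], number of values of each variable
    """
    # widths[k-1] is the width of an X_k rectangle: N_0 * N_1 * ... * N_(k-1), N_0 = 1
    widths = [1]
    for n_k in n_values[:-1]:
        widths.append(widths[-1] * n_k)

    # Move right by (j_k - 1) rectangles of width widths[k-1] for every k >= L
    left = sum((j[k - 1] - 1) * widths[k - 1]
               for k in range(level, len(n_values) + 1))
    width = widths[level - 1]
    return left, left + width, left + width / 2
```

For N = (3, 3, 4), the X₂ rectangle with j₂ = 2 and j₃ = 3 starts at 1·3 + 2·9 = 21 and ends at 24.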

The Effects of the Various Tools

The Zoom In Tools (scaled and unscaled) reduce the dimensionality n. The Zoom Out Tools (scaled and unscaled) increase the dimensionality n (up to the maximum starting value).

The Animate Tools (scaled and unscaled) increment the value j_L. The General Zoom Tools (scaled and unscaled) reduce N_L with values of X_L remaining contiguous, i.e., ΔX_L constant.

The General Zoom Reset Tool restores N_L to its original value. The Decimate Tools (scaled and unscaled) reduce N_L with ΔX_L increasing.

The Undecimate Tools (scaled and unscaled) restore N_L to its original value.

The Permute Tool interchanges two variables, say X_i and X_j, hence, in general, affecting N_i and N_j (and, in general, the pattern of hierarchical rectangles unless N_i=N_j, {X_i}={X_j} and W has the exact same functional dependence on X_i and X_j).

The Resize Tools simply alter the size of the X window or slider widgets (Resize Panel tool). The Clone Tool simply clones an existing X window.

The Expander Tool is applied about a particular selected point in the multi-dimensional space and displays the variation along the variable (X₁ ) direction, variable (X₂) direction, etc. using a homogeneous horizontal increment for each variable rather than a hierarchical increment.

That is, the Expander Tool displays

W(X₁, X_2selected, X_3selected, ..., X_nselected) vs X₁,

W(X_1selected, X₂, X_3selected, ..., X_nselected) vs X₂,

...

W(X_1selected, X_2selected, X_3selected, ..., X_n) vs X_n

as simple color-coded x,y plots.
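The slices displayed by the Expander Tool can be illustrated with a short Python sketch. The function name and data layout are hypothetical; the sketch only shows how one curve per variable is obtained by varying that variable while holding the others at the selected point.

```python
def expander_slices(w, selected, axis_values):
    """Return one list of (x, W) pairs per independent variable.

    w           -- callable taking n independent variable values
    selected    -- the selected point (X_1selected, ..., X_nselected)
    axis_values -- axis_values[k] lists the values X_(k+1) may take
    """
    slices = []
    for axis, values in enumerate(axis_values):
        curve = []
        for value in values:
            point = list(selected)
            point[axis] = value  # vary only this variable
            curve.append((value, w(*point)))
        slices.append(curve)
    return slices
```

For example, with W = X² + Y² and the selected point (1, 2), the first curve varies X at Y = 2 and the second varies Y at X = 1.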

There is shown in Fig. 14 an example of a computer system 80 on which the present invention can be run. It is comprised of monitor 81, CPU and mass storage device 82, keyboard 83 and mouse 84. Computer system 80 can be in many configurations.

In the present embodiment of the invention, the multi-dimensional graphing in two-dimensional space software was developed and is being run on a Hewlett Packard model 330CH computer, which is described generically as a Motorola 68020 microprocessor, a Motorola 68881 floating point co-processor, a 1280x1024 8-plane graphics card, and 4 megabytes of dynamic RAM. The operating system being used is Hewlett Packard HPUX version 7.0. The program environment is C using the HPUX C-compiler. The graphics environment is the X Windows System™, as implemented by Hewlett Packard in HPUX 7.0. Those skilled in the art will understand that the present method and system are not limited to this computer system and operating environment. In fact, successful operation of the system has been accomplished on a Sun SPARCstation 1 and a Sun 3, both running Sun Operating System; Solbourne computers running Sun Operating System; and 386 machines running Interactive Unix.

There is shown in Fig. 15 an irregularly spaced grid 15A and the corresponding plotting of rectangles 15B.

In irregularly spaced independent variable grid 15A, there are shown multiple data points with spacings Δ₁, Δ₂, Δ₃, Δ₄ along the Y-axis and spacings δ₁, δ₂, δ₃ along the X-axis.

Fig. 15B is one possible rendering involving hierarchical rectangles for the function W = X² + Y².

Note that horizontal gaps 155 appear between the first of the faster running rectangles 151 and the second of the faster running rectangles 152, but that all the faster running rectangles are of the same width. Similarly, a gap 156 appears between the second of the slower running rectangles 153 and the third 154, but all slower running rectangles are of the same width. Even though the Δ and δ distances shown in Fig. 15A are integer multiples of Δ_i and δ_i, this is not a limitation of the system.

There is shown in Fig. 16 a case of non-grid sampling of independent variables which are not completely random. A variety of samplings of the independent variable space which are not grid-like, and which are also not perfectly random, are possible. Using the data set in Fig. 16A, and applying the function W = X² + Y², the rendering involving hierarchical blue rectangles 161 in Fig. 16B comes about. Note that the widths 162 of the rectangles vary to reflect the extent of the X-axis variable being sampled.

The invention can be extended to the case of randomly sampled independent variables in other ways. First, one could use multi-linear interpolation or more advanced methods to evaluate the dependent variable over a standard grid and then use the invention. Second, one could first treat the "dependent" variable on the exact same footing as the independent variables and perform a multivariate binning. In this case, the number of points in an N+1 dimensional bin (i.e., N original independent variables plus the original dependent variable) would become the new dependent variable and the newly quantized (via the binning process) old dependent variable would be mapped to the hierarchical horizontal axis. This would allow one to look for correlations between the horizontal axis variables. Figs. 17A and 17B show the simple case of no correlation between the original dependent variable and one independent variable. Figs. 18A and 18B show the simple case w=x² with no noise and, hence, perfect correlation. The uncorrelated case will show, for example, Gaussians of Gaussians if the variables are normally distributed. The important point is that the distributions differ from one value of the slower variable to the next only in their amplitude. Note, however, that in the correlated case, the distributions clearly evolve in an orderly fashion not involving a simple amplitude scaling.

Just as one can replace standard two-dimensional Cartesian x,y plots by the present invention, wherein the dependent variable is plotted along the vertical axis while all independent variables are plotted hierarchically along the horizontal axis, one can replace standard 2d color maps, where the independent variables, say x and y, are plotted along the horizontal and vertical axes respectively and color is used to denote the value of the dependent variable, by maps in which both the vertical and horizontal axes are hierarchical. That is, some independent variables are mapped hierarchically to the horizontal axis, and the rest are mapped hierarchically to the vertical axis. In this case, the color of the resulting nested rectangles could be determined by the values of the dependent variable over the corresponding subspace in a variety of ways, such as the maximum within the subspace, the minimum within the subspace, etc. One could also color only those rectangles that have values falling within a specified range of the dependent variable, the remaining rectangles being shown in black. In Fig. 19, this scheme is shown for the case

Here, w is the dependent variable and x₁, y₂, z₃ and r₄ are four independent variables, each of which takes on values of -1, 0 and 1. Therefore, the total number of points is 3⁴ =81. The information would typically be displayed in color. In Fig. 19, the letters A, B, C, D and E are used in the blocks to designate colors (i.e., all blocks with the letter A would show the same color when displayed). Here, a complete set of tools analogous to those described above could be used for both the independent variables and the dependent variable.
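The coloring rule described for the hierarchical color map can be sketched in Python as follows. The function `w` below is a hypothetical stand-in chosen only so that the example runs; it is not the function plotted in Fig. 19. Treating x₁ and y₂ as the faster-running pair nested inside each (z₃, r₄) rectangle is likewise an assumption of this sketch.

```python
from itertools import product

VALUES = (-1, 0, 1)  # each of the four independent variables takes these values


def w(x1, y2, z3, r4):
    # Hypothetical dependent variable, used for illustration only
    return x1 * y2 + z3 * r4


def subspace_color(z3, r4, low, high, reduce=max):
    """Color value for the (x1, y2) subspace at fixed (z3, r4).

    The subspace is summarized with `reduce` (max or min). If the summary
    falls outside [low, high], the rectangle is shown in black.
    """
    summary = reduce(w(x1, y2, z3, r4) for x1, y2 in product(VALUES, repeat=2))
    return summary if low <= summary <= high else "black"
```

In a display, the returned number would be looked up in a color table (the letters A to E in Fig. 19 play that role); the lookup itself is omitted here.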

In multi-dimensional graphing in two-dimensional space, it is possible to produce graphs using two or more dependent variables. The prior examples illustrate cases using one dependent variable and multiple independent variables.

In cases where the multiple dependent variables are defined for the same set of independent variables or for some common subset of independent variables, it can be very useful to display all of the dependent variables in the same graph. This allows for visualizing possible correlations which may be occurring between the dependent variables for certain combinations of the independent variables.

One way of accomplishing this is to establish one or more new independent variables which are associated with the set of dependent variables. The new independent variables which refer to dependent variables are called dependent variable selection, or DVS, variables. For example, in plotting R dependent variables, each of which is a scalar, a DVS variable would be established having values 1, 2, 3, . . . R. The collection of R dependent variables in this example can be thought of as a vector, or a singly subscripted array (i.e., A_i with i=1 to R).
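A minimal Python sketch of the DVS idea follows: R scalar dependent variables are folded into one function that takes the DVS value as an extra independent variable. The helper name `with_dvs` and the placement of the DVS value as the first argument are assumptions made for illustration.

```python
def with_dvs(dependent_functions):
    """Combine R dependent variables into one function of (dvs, X_1, ..., X_n).

    dependent_functions -- list of R callables sharing the same independent
                           variables; dvs = 1 selects the first, dvs = R the last.
    """
    def combined(dvs, *independent):
        return dependent_functions[dvs - 1](*independent)
    return combined
```

Once combined this way, the DVS value can be ranked as the fastest-running, slowest-running or any intermediate variable, like any other independent variable.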

This singly subscripted array may not correspond physically to a vector. In some cases, it could refer to a collection of variables, such as specific heat, lattice constant, magnetic susceptibility, and thermal conductivity. This collection would not normally be thought of as components of a vector, but could very well be part of a materials properties database. It must be repeated that the applicability of the present invention is not limited to mathematical formulas, but rather to all functions. As functions are defined as a locus of points, many forms of data are applicable for graphing with the present invention. This includes database information, statistical information, matrix information, and mathematical formulas. Another example of dependent variables could be the x, y, and z components of a vector representing an electric field.

Whether it is a simple database component or a component of a mathematical formula, the use of new independent variables representing the DVS allows for multiple dependent variable representation. In the case of a materials property database, a new independent DVS variable could be established having a value of 1 representing specific heat; a value of 2 representing the lattice constant; a value of 3 representing magnetic susceptibility; and a value of 4 representing thermal conductivity. In the case of the electric field vector, a value of 1 could represent the x component of the vector; the value 2 could represent the y component; and the value 3 could represent the z component of the vector.

It is also possible to portray the multiple dependent variables as doubly subscripted arrays of the form A_ij. In this example, two DVS variables are created, with one covering the range of integers i (i=1 to R_i) and the other covering the range of integers j (j=1 to R_j). For this doubly subscripted array example, the number of scalar dependent variables is R_i * R_j, as long as there are no missing values.

The collection of values for A_ij might actually correspond to a property which is usually regarded as a tensor or a matrix.

The collection of data values may also be organized into R_i categories, with each category having one or more properties (dependent variables) within it. In this case, the first dependent variable (the i in A_ij) could be thermodynamic properties. The second dependent variable (the j in A_ij) could represent several different properties within this variable. This organization structure would be repeated for the other first dependent variables. In all cases, the multi-dimensional graphing in two-dimensional space allows for database visualization and the use of the tools that have been previously described.

Further examples could involve multiple dependent variables that involve arrays with three or four or more subscripts by introducing three or four or more DVS variables to represent them. In each case, the DVS variables are treated in the same manner as all other independent variables. Further, the DVS variables will also function with non-regular grid values (grid spacing) as described with reference to Figs. 15-18.

Shown in Figs. 20-23 are examples where one DVS independent variable has been added for the case of three dependent variables.

There are shown in Figs. 20-23 examples of the present invention, where one DVS independent variable has been added corresponding to three dependent variables. In Fig. 20A, there is shown a multi-dimensional graph in two-dimensional space for a dependent variable A versus two independent variables, t and h. Here, t is represented as the fastest-running variable and h as the slowest-running variable. Each independent variable in Fig. 20A has three associated values.

There is shown in Fig. 20B a graph for a second dependent variable B. The values of B can be normalized so that the numerical value of the maximum value of B (B_max) is identical to the maximum value of A (A_max) of Fig. 20A. In general, the values of the independent variables of A and B can be different. The two independent variables in Fig. 20B are identical to those in Fig. 20A, with t being the fastest-running variable and h being the slowest-running variable.

There is shown in Fig. 20C a graph for a third dependent variable C. Again, the values of C have been normalized so that C_max = A_max. The two independent variables t and h are identical to those in Figs. 20A and 20B, with t as the fastest-running variable and h as the slowest-running variable.

If the dependent variables A, B and C have the same unit or dimension, as in the case of the three components of a vector, it might not be desirable to normalize them so that A_max = B_max = C_max. This is obviously dependent upon the application to which the present invention is being applied.
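The normalization described for Figs. 20B and 20C can be sketched as a simple rescaling. The function name is hypothetical, and the sketch assumes that both maxima are positive; it is one possible way to obtain B_max = A_max, not necessarily the one used in the figures.

```python
def normalize_to(reference, values):
    """Rescale `values` so that its maximum equals the maximum of `reference`.

    Assumes max(reference) > 0 and max(values) > 0.
    """
    scale = max(reference) / max(values)
    return [v * scale for v in values]
```

When the dependent variables share a unit, as with vector components, this step would simply be skipped.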

There is shown in Fig. 21 a graph of the results after defining a new DVS-type independent variable. In Fig. 21, the DVS independent variable is the slowest-running variable, t is the fastest-running variable, and h is the second fastest-running variable.

There is shown in Fig. 22 a graph of the same information as in Fig. 21, except that the DVS independent variable is displayed as the fastest-running variable, while t is the second fastest-running variable, and h is the slowest-running variable.

There is shown in Fig. 23 a graph of the same information as in Figs. 21 and 22, except that t is now the fastest-running variable, the DVS independent variable is the second fastest-running variable, and h is the slowest-running variable.

There is shown in Fig. 24 a multi-dimensional graph in two-dimensional space, wherein the rectangles have a width corresponding to the independent variable value. The vertical extent or height of the rectangles from zero on the y-axis is determined by summing the heights or vertical distances of the preceding slowest-running variable (the largest rectangles contained within the rectangle of interest). This provides a different visualization of the function being graphed than the manner of graphing where the rectangle's vertical boundaries were based on the minimum and maximum values of the preceding rectangles (largest rectangles contained within the rectangle of interest).

The rectangles are colored, as in the other cases. An additional feature allows the user to select which rectangle is to be drawn first, and, therefore, could possibly be masked or partially masked by later drawn rectangles. The user also may select the order in which the remaining rectangles are drawn. When the graphing of the rectangles is complete, the user can choose to redraw any particular rectangles to account for masking.

To illustrate drawing the rectangles, consider a rectangle corresponding to an independent variable subspace of dimension r. The non-zero vertical end (other vertical extreme) of this rectangle, V_r, is given by the equation

$$V_r = \sum_{i=1}^{n_{r-1}} V_{r-1,i}$$

where V_r-1,i is the non-zero vertical extreme of the i^th rectangle corresponding to a subspace of dimension r-1 and n_r-1 is the total number of rectangles corresponding to a subspace of dimension r-1.

The graph shown in Fig. 24 shows a multidimensional graph in two-dimensional space with rectangles based on the sum of the next largest rectangles contained within. For the simple three independent variable case of Fig. 24, black is the fastest variable, blue is the next-fastest and red is the slowest-running variable. The black and blue variables have three values, while the red variable has four values. In this case, the dependent variable is positive definite (i.e., either positive or zero).

There is shown in Fig. 25 another graph where the height of the rectangles is determined by summing the values of the next-largest rectangles contained within the rectangle of interest. The graphing of rectangles using minimum and maximum as for graphs prior to Fig. 23 can be expressed recursively as:

$$V_r(\text{minimum}) = \text{minimum of all } V_{r-1} \text{ within the } r \text{ subspace rectangle width of interest; and}$$

$$V_r(\text{maximum}) = \text{maximum of all } V_{r-1} \text{ within the } r \text{ subspace rectangle width of interest; and}$$

$$V_0(\text{minimum}) = V_0(\text{maximum}) = W, \text{ the value of the dependent variable at the point of interest.}$$
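The min/max recursion can be illustrated with a short Python sketch that walks a nested list. Representing the data as nested lists, with a bare number standing for a single point, is an assumption of this example rather than the data structure of the described program.

```python
def min_max(node):
    """Return (bottom, top) of the rectangle for `node`.

    node is either a number (a point, where bottom = top = W) or a list of
    child nodes, one per value of the next faster-running variable.
    """
    if not isinstance(node, list):
        return node, node  # base case: a single point
    extremes = [min_max(child) for child in node]
    bottom = min(lo for lo, _ in extremes)
    top = max(hi for _, hi in extremes)
    return bottom, top
```

For the two-variable data set [[1, 4, 2], [0, 3, 5]], the inner rectangles span 1 to 4 and 0 to 5, and the enclosing rectangle spans 0 to 5.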

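Similarly, the graphing of rectangles using the summation of the next-largest rectangles contained within the rectangle of interest can be expressed recursively as:

$$V_r(\text{one vertical extreme}) = 0; \text{ and}$$

$$V_r(\text{other vertical extreme}) = \sum_{i=1}^{n_{r-1}} V_{r-1,i}(\text{other vertical extreme}); \text{ and}$$

$$V_0(\text{one vertical extreme}) = 0; \text{ and}$$

$$V_0(\text{other vertical extreme}) = W, \text{ the value of the dependent variable at the point of interest.}$$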

A simple variation on this scheme would be to use the average, i.e.,

$$V_r(\text{other vertical extreme}) = \frac{1}{n_{r-1}} \sum_{i=1}^{n_{r-1}} V_{r-1,i}(\text{other vertical extreme})$$

In Fig. 25, the black rectangles are the fastest-running variable, the blue rectangles are the next fastest, the red rectangles are the next-fastest, and the yellow rectangle is the slowest-running variable. Different from the other figures, it can be seen that certain rectangles contain faster-running variables having a vertical dimension which exceeds the vertical dimension of the slower-running variable. These rectangles exceed the slower-running variable in both the positive and negative directions, as shown at red₁, blue₂, black₂ and red₃, blue₂, black₃, respectively.

With the use of negative values, it is possible for a nested rectangle to exceed the sum of the nested rectangles, as a negative value takes away from the summed value. In this particular graph, the summed values are not summed absolute values.

In Fig. 25, the base of each rectangle is always at zero, and the other vertical extremity is based upon the sum of the next slowest-running variable contained within.
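The summation scheme, and its averaging variation, can be sketched in Python in the same nested-list style. The function name and the `average` flag are hypothetical; one vertical extreme is taken to be zero and only the other extreme is returned.

```python
def summed_extreme(node, average=False):
    """Return the non-zero vertical extreme of the rectangle for `node`.

    node is either a number (the dependent variable at a point) or a list of
    child nodes. With average=True the sum at each level is divided by the
    number of children.
    """
    if not isinstance(node, list):
        return node  # base case: the value W at a point
    children = [summed_extreme(child, average) for child in node]
    total = sum(children)  # signed sum, not a sum of absolute values
    return total / len(children) if average else total
```

With the values [3, -2, 1] the sum is 2, so the nested rectangle of height 3 extends beyond the rectangle that contains it, which is the effect noted for Fig. 25.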

There is shown in Fig. 26 a graph for a case where the vertical extreme (the other end of the rectangle away from the zero base) is a new function of the vertical extremes of rectangles corresponding to r-1 dimensional subspaces within the r-dimensional subspace.

It can be seen that one vertical extreme is set to zero, while the other vertical extreme is obtained by summing the non-zero vertical coordinates of a subset of the nested rectangles according to the following formula:

$$V_{red}(\text{other vertical extreme}) = \sum_{i\,=\,\text{red variable value}} V_{blue}(\text{other vertical extreme})$$

It is possible to let the functions depend on which subspace of dimension r is being considered. In this case, the functions could depend on the values of slower-running variables. In general, both vertical extremes of a given rectangle could depend on any function of the vertical extremes of smaller rectangles and/or the values of all slower- and faster-running independent variables.

The embodiments described to this point have involved only a few specific aspects of the invention. The present invention is not limited to these embodiments only. The descriptions so far have, for the most part, involved only a single horizontal hierarchical axis. That is, the previously described rectangles are only arranged hierarchically in the horizontal direction. The direction of the hierarchical arrangement is the direction of the hierarchical axis. In the case of the previously described material and embodiments, the single hierarchical axis was oriented in the horizontal direction. The present invention does not limit the arrangement to a single hierarchical axis, nor does a hierarchical axis have to be oriented along the horizontal direction. In addition, the present invention does not limit the symbols to be rectangles.

To describe a multiple hierarchical axis scheme and a more general symbol representation of the present invention, a few new concepts need to be introduced. The concepts are the hierarchical cell and the hierarchical symbol. The term multi-dimensional graph in a two-dimensional space will be replaced by the acronym MGTs from this point forward.

A hierarchical cell is a "container" of a hierarchical symbol. Its function is to determine the position of a hierarchical symbol on the two-dimensional display plane. Any given hierarchical cell is directly related to a collection of independent variable values. Exactly how these collections of independent variable values relate to the cells is determined by the current ranking of the independent variables themselves.

Generally, a hierarchical cell contains a grouping of other hierarchical cells which can in turn contain yet another grouping of hierarchical cells. Each level of this hierarchy corresponds to an independent variable. Thus, the ranking of the independent variables determines the hierarchy of cell nesting. The slowest running variable is at the top of the hierarchy and the fastest running variable is at the bottom. Therefore, the slowest running variable will run over all of its values one time. Each one of the values of the slowest running variable will have a hierarchical cell assigned to it. These cells are called the slowest running cells. These cells can be arranged geometrically in the horizontal and/or vertical direction (or, in fact, in any fashion). Each slowest running cell corresponds to a value of the slowest running variable. Within each slowest running cell is a grouping of second slowest running cells which correspond with the values of the second slowest running variable. These second slowest running cells can be arranged geometrically in the horizontal and/or vertical direction (or, in fact, in any fashion). Each second slowest running cell corresponds to a value of the second slowest running variable. As can be seen, there is a complete grouping of second slowest running cells for each slowest running cell. Therefore, unique characterization of a second slowest running cell must include a specification of which slowest running cell it belongs to, as well as its location within that particular slowest running cell. This hierarchy of cells continues until the fastest running variable values or cells are reached. The end result is a mapping of an n-dimensional space to a two-dimensional space. The cells themselves can be any size or shape.

A hierarchical symbol is associated with a hierarchical cell. Generally, but not exclusively, a hierarchical symbol is inside the hierarchical cell. A hierarchical symbol is a representation of a dependent variable value or values, or a representation of two or more dependent variable values, or, in general, some result or results which are associated with the independent variable values corresponding to the cell in which the symbol is displayed. In general, each hierarchical symbol consists of a hierarchical collection of symbols or other hierarchical symbols which are displayed on the underlying hierarchy of cells. At any level of the hierarchy, a symbol can be any geometric shape, and, in fact, could be a collection of non-hierarchically related symbols forming a composite.

Hierarchical symbols need not be hierarchical at every level of the hierarchy, and, in fact, may not be hierarchically related at all. An example of this can be seen when some of the independent variables are used purely as ordering instruments. In this case, each hierarchical symbol corresponding to the ordering variables could be an entire MGTs graph. Thus, the end result is a collection of MGTs graphs which may or may not be directly related to one another.

Hierarchical cells can be viewed as the rulers or scales or matrix upon which hierarchical symbols are displayed. As such, hierarchical cells determine the positions of hierarchical symbols. A hierarchical symbol determines or represents the shape of the image which is representing the underlying data. In terms of the embodiments already described, the hierarchical symbols are the hash marks and rectangles which can be shown as various representations or renderings of the underlying data. It is equally possible and acceptable for hierarchical symbols to consist of circles or ellipsoids or other shapes wherein the radius or diameter of the symbols, for example, has some relation to the underlying data.

Fig. 27 shows an example of a one hierarchical axis cell arrangement 270 and a two hierarchical axis cell arrangement 280 of the present invention. In this example, some of the cells are shown larger than they would normally appear, in order to show how cells are hierarchically contained or nested, one within another.

In Fig. 27 there is shown graph 270. Graph 270 is a one hierarchical axis cell arrangement for a data set having four independent variables, each of which has two values. The independent variables are variables used to construct a cell space. Thus, an independent variable value has a one to one relationship with a cell. In addition, a collection of independent variable values also has a one to one relationship with a cell. In other words, the slowest running cell is a collection of the next slowest running cells which in turn is a collection of the next slowest running cells until the fastest running cells are reached. You will note that the terms "fastest running cell" and "slowest running cell" are being used instead of the terms "fastest running variable" and "slowest running variable", as the term cell is more specific regarding a given configuration of values of the independent variables, which can be arranged in any ranking from fastest to slowest, slowest to fastest, etc. A cell is an index of a specific configuration of independent variable values.

Going back to Fig. 27, there is shown graph 270 and graph 280. Graph 270 and graph 280 show a hierarchical arrangement of cells labeled in legend 275. The fastest running independent variable cells are shown as white rectangles, the second fastest running independent variable cells are shown as slightly darker rectangles, the third fastest running independent variable cells are shown as even darker rectangles and the slowest running independent variable cells are shown as the darkest rectangles. For purposes of illustration of the one hierarchical and two hierarchical cell arrangements, some of the rectangles are enlarged to show the nesting of the rectangles. Had some of the rectangles not been enlarged, they would overlap each other, and the nesting would not be clearly visible.

In graph 270, fastest running independent variable value cells 271a and 271b are shown nested (contained) in next fastest running independent variable value cells 272a and 272b. The next fastest running independent variable value cells (to 272a and 272b) are 273a and 273b. The slowest running independent variable value cells are 274a and 274b. Though each rectangle is not indicated with a reference number, it should be clear that each next slowest running independent variable value cell incorporates all of the next fastest running variable value cells within it. The nesting of cells takes place in the horizontal axis only.

In graph 280, the arrangement of the cells is organized in both the vertical and horizontal directions. The number of independent variables, the ranking of the independent variables and the number of values for each independent variable is equal to that shown in graph 270. It should be noted, however, that this is not a requirement of the invention.

In graph 280, the reference numbers pointing to the independent variable values correspond to the ranking as described for graph 270. In graph 280, a two-hierarchical axis cell arrangement, the fastest running independent variable value cells 271a and 271b are arranged along the horizontal axis. The next fastest running independent variable value cells 272a and 272b are positioned along the vertical or y-axis. Continuing along, the next fastest running variable value cells (after 272a and 272b) are 273a and 273b. These independent variable value cells are arranged along the horizontal axis. Finally, the slowest running independent variable value cells 274a and 274b are arranged along the vertical axis.

The contrast between the two types of arrangements becomes apparent when a large set of data (or a function with a large set of values) is being displayed. Graph 270 might not be able to display all of the information, because it can only nest cells in the horizontal direction. Therefore, the number of cells that can be displayed is limited by the number of display pixels in the horizontal direction. This, of course, is related to the computer equipment being used. By displaying the information in an arrangement similar to graph 280 (a two-hierarchical axis cell arrangement), a larger number of hierarchical symbols can be displayed by utilizing additional space in the vertical (or y-axis) direction for cell arrangement. In doing so, the dynamic range, or the size of the symbols which can be displayed for a given cell, is reduced. Therefore, in going from a one hierarchical axis cell arrangement to a two hierarchical axis cell arrangement, there is a trade-off between the dynamic range of a symbol and the number of symbols which can be displayed.
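The pixel limit and the dynamic range trade-off can be illustrated with a small calculation. The following is only a sketch: the function names, the alternating horizontal/vertical assignment starting from the fastest running variable (taken from the description of graph 280), and the example display size are assumptions made for illustration.

```python
from math import prod

def cells_per_axis(value_counts, axes=2):
    """Count the finest-level cells across and down the display.

    value_counts lists the number of values of each independent
    variable, fastest running variable first.
    axes=1: every variable nests along the horizontal axis.
    axes=2: variables alternate horizontal, vertical, horizontal, ...
            starting with the fastest (an assumption based on graph 280).
    """
    if axes == 1:
        return prod(value_counts), 1
    return prod(value_counts[0::2]), prod(value_counts[1::2])

def fits(value_counts, width_px, height_px, axes):
    """True when every finest-level cell gets at least one pixel."""
    across, down = cells_per_axis(value_counts, axes)
    return across <= width_px and down <= height_px

def symbol_height_px(value_counts, height_px, axes):
    """Vertical pixels left for a symbol in one finest-level cell."""
    _, down = cells_per_axis(value_counts, axes)
    return height_px // down

counts = [30, 30, 20, 20]  # hypothetical numbers of values per variable
print(cells_per_axis(counts, axes=1), fits(counts, 1024, 768, axes=1))
print(cells_per_axis(counts, axes=2), fits(counts, 1024, 768, axes=2))
print(symbol_height_px(counts, 768, axes=1), symbol_height_px(counts, 768, axes=2))
```

With these made-up numbers the one hierarchical axis arrangement cannot give every cell a pixel, while the two hierarchical axis arrangement can, at the cost of far fewer vertical pixels per symbol.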

The cells themselves do not have to be rectangular. It is possible to construct the cells using circles (rather than rectangles) as well as many other geometric representations. This depends on the way a user desires to see information represented. Also, certain information may be better displayed with a non-rectangular cell arrangement.

Thus, it is not a requirement of the invention to display information using rectangular symbols. Circles, ellipsoids, or combinations of various geometric shapes can be used to display different information. For instance, arrow heads representing vectors can be combined with rectangles or circles to convey different components of information for a particular data set or event or function. By combining various symbols, more information may be visually available for a particular cell.

The type of hierarchical symbol to be used in an MGTs graph depends on the results (which can come from functions, data, dependent variables, etc. and the operators: min/max, sum, mean, etc.). Some symbols which can represent a single result are a horizontal or vertical line, a rectangle, a circle, an ellipse, a triangle, etc. Some symbols that can represent two results are a combination of horizontal and/or vertical lines, a rectangle, a rectangle inside a rectangle, a circle inside a circle, an ellipse, an ellipse inside an ellipse, etc. In general, for an arbitrary number of results, a symbol or group of symbols can be combined with other symbols to form a composite symbol. In all cases, any symbol attribute can represent a result or results. For example, a symbol's position, size, angular orientation, construction, color, etc. can all represent a result.

An example of how hierarchical symbols can be constructed from the underlying data associated with the hierarchical cells follows. In the present invention various operators are used to construct a symbol in a given cell. The previously discussed min/max operator takes the minimum and maximum of the results in all cells or selected cells contained in the cell of interest, that is, the cell in which the symbol of interest is displayed. A symbol is constructed from the minimum and maximum obtained, a rectangle for instance, and displayed along with the symbols of the contained cells from which the minimum and maximum were obtained. The resulting collection of symbols "within" symbols (although the symbols do not have to geometrically contain one another) are hierarchical min/max symbols. These hierarchical min/max symbols can be displayed in cells arranged in a one hierarchical axis cell arrangement, two hierarchical axis arrangement, etc. The resulting MGTs graph could then be called a one hierarchical axis min/max graph, a two hierarchical axis min/max graph, etc. The min/max operator is only one of many possible operators including sum, minimum, maximum, mean, standard deviation, standard deviation of the mean, etc. Operators can be applied to a single dependent variable value to produce one or many results to be represented by a symbol. Operators can also be applied to multiple dependent variable values to produce single or multiple results. Examples of operators which produce multiple results are the min/max, mean ± standard deviation of the mean, etc.
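The min/max operator described above can be sketched with NumPy. This is a minimal illustration rather than the implementation used by the invention: it assumes the dependent variable is stored in an array whose axes are ordered from the slowest running variable to the fastest, and the function name is hypothetical.

```python
import numpy as np

def hierarchical_min_max(values):
    """Return one (min, max) pair per hierarchy level.

    values: array whose axes run from the slowest running variable
    (first axis) to the fastest running variable (last axis).
    Level 0 holds the individual results; each later level collapses
    the fastest remaining variable, ending with the global cell.
    """
    lo = hi = np.asarray(values, dtype=float)
    levels = [(lo, hi)]
    while lo.ndim > 0:
        lo = lo.min(axis=-1)  # minimum over the contained cells
        hi = hi.max(axis=-1)  # maximum over the contained cells
        levels.append((lo, hi))
    return levels

# Three variables with two values each (made-up results 0..7).
data = np.arange(8).reshape(2, 2, 2)
for level, (lo, hi) in enumerate(hierarchical_min_max(data)):
    print(level, np.shape(lo), lo.tolist(), hi.tolist())
```

Each level supplies the two results needed to draw one rectangle per cell at that level, so drawing the levels together gives the nested min/max symbols.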

Once hierarchical symbols have been constructed, scaling tools need to be made available for the purpose of comparing various symbols. For instance, there does not have to be a global scale associated with the display of all symbols. Symbols may be grouped and the associated groups assigned their own scales. Scaling in this way provides many ways to visualize symbols generated from a given set of results.

In Fig. 29, there are shown several examples of hierarchical symbols and cells. Cell 290 shows a horizontal line 291 corresponding to a single result. In cell 292, there is shown a filled vertical bar 293 which corresponds to a single result for that cell. In cell 294, a two hierarchical axis cell, there is shown a symbol 295 which corresponds to a single result. Symbol 295 is rectangular and corresponds to a single result by taking the appropriate percentage of the cell's size or area. In cell 296, there is shown a rectangle 297. Cell 296 is a one hierarchical axis cell. Rectangle 297 has an upper vertical boundary 298 and a lower vertical boundary 299, thus representing two results. Cell 300 shows a three result symbol 301 consisting of a first rectangle 302 and a second rectangle 303. Rectangle 302 and rectangle 303 each correspond to two results. The combined symbol 301 corresponds to three results since rectangles 302 and 303 have a common edge.

It is important to keep in mind that results, represented by the symbols just given as examples, can be composites of results of cells contained inside of any given cell. Therefore, any given symbol is hierarchical in nature, because it can represent many symbols which are nested in faster running variable results. The present invention, which has been described as exclusively hierarchical in nature regarding the nesting of cells, can also be used as a combination of non-hierarchical cells with hierarchical cells. For instance, a single hierarchical cell can be divided into multiple subcells, with each of the subcells containing a symbol. In this case, the subcells are not necessarily hierarchically related. A vector which contains a horizontal and vertical component could have each of the components represented by its own symbol in a subcell in a main hierarchical cell. In this way, both components of the vector would be separately represented by a symbol yet combined as a vector in the hierarchical cell containing the subcells. As with the other representations, symbols other than rectangles can be used such as those already described above.

A hierarchical cell can be divided into any number of subcells. The division into subcells depends upon the application that is being run at the time. The subcells do not have to contain the same symbol. In this way, subcells can represent different information about a particular hierarchical symbol.

Both the one hierarchical and two hierarchical methods using a rectangular cell and cell arrangement, as previously described, are extremely useful in visualizing multi-variate data. However, it should be pointed out that there are many possible cell shapes and arrangements which could be of use for specific problems in multi-variate visualization (MVV). A difficulty with the rectangular method can be seen when looking for trends in a result for a particular faster running cell for a set of contiguous slower running cells. The symbols displayed in the faster running cells neighboring the faster running cell of interest clearly distract the eye from the symbols of interest, which are not contiguous. One way to solve this problem is to use what is called the display tool to graph only selected subsets of rectangles. A simple example of the use of the display tool is shown in Fig. 36B, where only the first of the three faster running rectangles of Fig. 36A is shown for each of the three slower running rectangles. However, it can be easily shown that an enhancement or generalization to the rectangular cells, cell arrangement and symbols can also solve this problem.

If, for instance, for a one hierarchical axis rectangular cell arrangement, the baseline of all cells corresponding to an independent variable were rotated through some angle, while the vertical "side" walls of cells remain vertical, forming parallelogram cells, then contiguous cells of faster running variables nested inside would have their origins offset in the vertical as a linear function of their horizontal position. This transformation of the slower running cells would ensure that all symbols corresponding to results for a given cell of the faster running cells would always have the same origin. Thus, trends in results for any given faster running variable cell for a set of slower running cells could be more easily seen even when contiguous faster running symbols were displayed. In order to further enhance the effect, the rectangular symbols themselves could have their horizontal edges, tops and bottoms, rotated through the same angle while the vertical edges remain vertical so that they become parallelograms. This transformation of the symbol further reinforces the rotation angle and, thus, the effect of each rectangle being a "plane". A simple example of this is shown in Fig. 36C. In order to enhance the effect, the rectangle widths can be made smaller than the cell widths, thus producing a gap 361 between symbols 362, as shown by the simple example in Fig. 36D. Similarly, one can shrink the cell width to produce gaps between cells. The effect produced by the geometric cell and symbol transformations, hierarchical symbols looking like orthographic parallel planes, solves the above problem in a visually intuitive way.
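The geometry of the rotated baseline can be sketched as a vertical shear. This is one reading of the passage above, not a definitive construction: the function names, the coordinate convention (y increasing upward) and the equal-width division of the slower cell are assumptions.

```python
import math

def nested_origins(x0, y0, cell_width, n_fast, angle_deg):
    """Origins of the faster running cells inside one slower cell.

    The slower cell's baseline starts at (x0, y0) and is rotated by
    angle_deg; side walls stay vertical. Each nested origin is lifted
    by an amount that is linear in its horizontal position.
    """
    slope = math.tan(math.radians(angle_deg))
    step = cell_width / n_fast  # assumed: nested cells of equal width
    return [(x0 + i * step, y0 + slope * i * step) for i in range(n_fast)]

def parallelogram(x, y, width, height, angle_deg):
    """Corners of a symbol whose top and bottom edges are rotated by
    angle_deg while its left and right edges remain vertical."""
    rise = math.tan(math.radians(angle_deg)) * width
    return [(x, y), (x + width, y + rise),
            (x + width, y + rise + height), (x, y + height)]

# Three faster running cells in a slower cell 90 units wide, baseline at 30 degrees.
for ox, oy in nested_origins(0.0, 0.0, 90.0, 3, 30.0):
    # A symbol narrower than its cell (20 of 30 units) leaves a gap between symbols.
    print(parallelogram(ox, oy, 20.0, 10.0, 30.0))
```

Making the symbol width smaller than the cell width, as in the last line, produces the gap between neighboring symbols mentioned for Fig. 36D.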

These generalizations and enhancements of the rectangular one hierarchical axis case are clearly designed to enhance the visualization of certain trends in the data. The angle of rotation of the cell tops and bottoms, the angle of rotation of the symbols' tops and bottoms, the width of the symbols and cells, and their placement inside of the cells could all be arbitrarily adjustable by the user. It is also important to note that there are many other possibilities for transformations on the cells and symbols which could provide additional improvements on the visualization of data or functions. In fact, different methods of analysis may require differing cell and/or symbol transformations. It is also important to remember that the cells and symbols are hierarchical in nature; thus many angles of rotation and symbol widths, etc., at many different levels of the hierarchy are possible, and could also be changeable by the user. In addition, the hierarchical symbols can be any shape irrespective of the cell shape.

Of course, geometric transformations of cells are not limited to the one hierarchical axis case only. In the case of a two hierarchical axis arrangement, cells can be transformed in both the horizontal and vertical directions, which, in the case of rotations, forms cells which are themselves parallelograms. Instead of a linear collection of parallel orthographically drawn "planes", as in the one hierarchical axis case, the two hierarchical axis case arranges the "planes" on a two dimensional matrix or lattice. Since each "plane" in general contains yet another lattice of "planes", this particular method of constructing a two hierarchical graph is sometimes called a hierarchical lattice. (In fact, a rectangular two hierarchical axis graph is a hierarchical lattice as well.) Fig. 37C shows a simple example of a hierarchical lattice while Figs. 37A and 37B indicate one way in which the display tool could produce a graph in the two hierarchical axes case. Of course, all angles of rotation, translations and sizing of symbols could be controllable by the user to best meet the needs of a given data set, function, etc. In the case of the hierarchical lattice, it is obvious that symbols of any shape can be placed in the parallelogram shaped cells. A circle is a particularly good symbol for visualizing trends because it has rotational symmetry on a two-dimensional display.

There is shown in Fig. 28 a status indicator 286 made up of four independent variable status indicators. A status indicator shows the present state of the display for the particular independent variable represented by the status indicator. For example, status indicator 281 corresponds to the slowest running variable of a display graph. As shown in status indicator 281, all four of the independent variable values contribute to the graph being displayed.

In status indicator 282, which is the next slowest running variable, four of the six values of the second slowest running variable are contributing to the display, as shown by all six cells containing shading. In status indicator 284, there are two cells present. Note in status indicator 284, the two cells 284A and 284B are represented by two different colors (or shading as shown in the figure). This is a very useful device for displaying information which is categorical in nature, as opposed to ordinal. For instance, gender (sex) value which is common in surveys or census data may classify or arrange the data according to gender information. In cases such as this, it can be beneficial from a visual standpoint to use different colors for each of the values. By using different colors for different cells for the same independent variables, a viewer is able to more easily discern information from the graphs,

particularly when large amounts of information are being displayed.

Status indicators can also serve a second role as the graphical user interface for the application of various tools used to modify the information displayed. For instance, if you wanted to use a subspace zoom, you might position a locator inside of status indicator 281, for example, in the second cell 281A, and select that cell, thereby subspace zooming to the second cell on the display. This could be repeated for any or all of the cells to zoom in or zoom out of particular parts of the display to discern different pieces of information. For instance, if one part of the display contained a feature (a grouping of symbols), the user could click on the particular cells for the various variables from the status indicators to zoom in or zoom out to that information. For this reason, the status indicators could also be referred to as status modifiers.

The status indicators can also serve as the graphical user interface for the application of other tools, such as the permute tool, where the ranking of the variables is switched or rearranged. One way this could be accomplished is by designating a particular indicator with a pointing device and moving it to a new position relative to the other indicators. As shown in Fig. 28, the variables are ranked from the slowest running variable 281 to the fastest running variable 284 (i.e. from bottom (slowest) to top (fastest)). By rearranging the position of the status indicators, the ranking of the variables could be changed.

Each of the status indicators 281-284 is a one dimensional status indicator showing the status of a single variable. It is also possible to have multi-dimensional status indicators which would show the status of collections of variables. In the case of a two-dimensional status indicator, the information which is portrayed in two single status indicators is combined into a two-dimensional array to show combined status information. This allows the user to view horizontal status indicator information which corresponds to horizontally arranged independent variable values or cells and vertical status indicator information which corresponds to vertically arranged independent variable values. A two-dimensional status indicator can obviously be more useful in a two hierarchical axis cell arrangement, because the cells are arranged in both the horizontal and vertical directions. By using a two-dimensional status indicator, the user sees a position in the status indicator which corresponds 1 to 1 with the position of that cell in the hierarchical graph.

Going beyond two dimensions in the status indicator (to the extreme) results in the status indicator being a mirror of the two hierarchical axis cell arrangement being displayed. In this way, pointing to a particular cell in the status indicator might zoom in to that particular cell on the display. Depending upon the tool being used, selecting a particular cell on the status indicator could carry out that function in the corresponding cell in the display. In Fig. 38, Fig. 38A shows a set of four one-dimensional status indicators; Fig. 38B shows the corresponding set of two two-dimensional indicators; and finally, Fig. 38C shows the corresponding single four-dimensional status indicator. Status indicators of differing dimensionality can be mixed, depending upon the graph being displayed and the needs of the user visualizing the graph. For instance, a one dimensional status indicator can be combined with a two-dimensional status indicator to show the state of three independent variables. By combining different status indicators, a user could more easily and more quickly determine the state of a given graph. As already described, the user could modify the graph from status indicators used as status modifiers.

Another use for multi-dimensional status indicators is to group particular independent variables together to be displayed on the graph. In many instances, it is more desirable to turn on certain symbols inside of cells and turn off others. This reduces the number of symbols being displayed, which allows the user to see relationships between noncontiguous symbols.

In using multi-dimensional indicators, it is possible to vary the number of dimensions in the status indicator at the discretion of the user. An example where this might be necessary is when zooming into a subspace. When zooming into a subspace, it may no longer be necessary to see the state of the variables that run slower than those being displayed.

The status indicators shown in Fig. 28 all show rectangles because, for this particular example, those are the shapes of the symbols which could be displayed in the corresponding graph. It is equally possible to have the status indicator cells correspond to a different symbol shape if the symbol shape being displayed is other than a rectangle. For instance, if a circle is being displayed as the symbol, it is possible to have the status indicator cells in a circular shape. In this case, each cell in the status indicators could be a circle and show shading to represent the position being displayed on the graph. It is also possible to have status indicator cells which do not correspond to the shape being displayed. Choices such as this would be made to better represent and interpret the information which is being displayed. As with the other tools, the status indicators could be changed as the data is being viewed and interpreted to enhance the perception of the data being displayed.

The status indicators are also useful for selecting groupings of variable values to be used in displaying slower running variables. For instance, if it was desired to see the age distribution for males only and the age distribution for females only, where age was a slower running variable than sex, as indicated in Fig. 28, it would be possible to select the male cell such that only males were included in indicator 284. The selection would then propagate through the results associated with the hierarchical symbols, producing an age distribution of males only.
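The propagation of a selection into the slower running results can be sketched as a masked aggregation. The sketch below assumes a sum operator, a table of counts indexed as `counts[age_bin, sex]`, and made-up numbers; the function name is hypothetical.

```python
import numpy as np

def propagate_selection(counts, selected):
    """Aggregate a slower running variable over selected faster values.

    counts: array whose last axis is the faster running variable.
    selected: one boolean per value of the faster running variable,
              mirroring which cells of its status indicator are on.
    Returns the sum over only the selected values (sum is an assumed
    operator; any other operator could be substituted).
    """
    counts = np.asarray(counts)
    mask = np.asarray(selected, dtype=bool)
    return counts[..., mask].sum(axis=-1)

# Made-up survey counts: rows are age bins, columns are (male, female).
counts = [[10, 12],
          [20, 18],
          [5, 9]]
print(propagate_selection(counts, [True, False]).tolist())  # males only
print(propagate_selection(counts, [True, True]).tolist())   # everyone
```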

In addition to the status indicators, there is shown in Fig. 28 a currently displayed cells indicator 285. Indicator 285 shows which of the independent variable cells that are actually displayed are slowest running, next slowest running and so forth. A similar currently displayed cells status indicator could also extend in the vertical direction, as in the case of a two hierarchical axis arrangement. In addition to showing which displayed variables run slowest or fastest, a currently displayed cells status indicator also shows where the result corresponding to an independent variable value resides on the graph. In 285, the independent variable values are all shown in ascending order from left to right.

As with the other status indicators, the currently displayed cell status indicators can be used as a status modifier with tools such as subspace zoom or permute, etc.

Another way to represent completely random data in the present invention is to first bin the data so that it can be represented on a grid of bins. In binning, the data is classified according to a predetermined scheme. For instance, age data can be binned into groupings of one to ten years old, eleven to twenty years old, twenty-one to thirty years old, etc. It is clear that other methods of arranging age data can also be used. By binning, data that is otherwise sparse or uneven in intervals can be organized into a data set which has even intervals. Using the technique of the present invention, it is desirable to display data which is evenly spaced. Therefore, binning is one way to provide evenly spaced data from absolutely random data which otherwise would be difficult to display because it does not have an underlying even interval (lattice-like) spacing.

Binning can also be done dynamically, where binned data is rearranged into coarser or wider bins. For the example using age data given above, the categories can be reduced to fewer bins covering larger spans of years. If binning is done as a pre-processing stage, then further binning, done dynamically, would only allow for a reduction in the number of bins. This reduction in the number of bins could be by factoring, because reducing the number of bins involves grouping previously binned information into larger bins. As a result, if the original number of bins was a prime number, the bins cannot be further reduced into a whole number of wider bins because an extraneous narrow bin or bins would be left. This is not a restriction of the current invention, since such extraneous bins could be a valid representation of the data. Another way to deal with these extraneous bins would be to discard them. Reducing the number of bins decreases the number of cells but may allow the viewer to see more variables at one time. While viewing the space with the fewer number of bins, certain correlations or features of the data may be visualized and isolated. At this time, the narrower bins can be reinstated to show more detail.
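Dynamic rebinning by factoring can be sketched as follows. This is an illustration under stated assumptions: the bins hold counts that are merged by summing, and the function names and the `drop_remainder` option are hypothetical.

```python
def rebin(counts, factor, drop_remainder=False):
    """Merge each run of `factor` adjacent bins into one wider bin.

    counts: per-bin counts from the pre-processing stage (assumed to
            combine by summing).
    If len(counts) is not a multiple of factor, the leftover narrow
    bins form one extraneous final bin, or are discarded when
    drop_remainder is True.
    """
    merged = [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]
    if drop_remainder and len(counts) % factor:
        merged.pop()  # discard the extraneous narrow bin
    return merged

def whole_factors(n_bins):
    """Factors that regroup n_bins with no bins left over."""
    return [f for f in range(2, n_bins) if n_bins % f == 0]

print(whole_factors(16), rebin([1] * 16, 4))
print(whole_factors(7), rebin([1] * 7, 2), rebin([1] * 7, 2, drop_remainder=True))
```

A prime number of bins has no whole factors, so any regrouping leaves a remainder that must either be kept as a narrower bin or discarded, matching the two options described above.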

Like the other tools, dynamic binning can be controlled directly from a status indicator. In this case, it is useful to modify the status indicators to show what level of binning is presently being displayed. Fig. 39 shows three possible ways (392, 393, 394) of rebinning a variable that originally has sixteen bins (391) and demonstrates how the level of binning would be indicated by the status indicators. The binning tool allows a user to see an entire n-dimensional space on the computer display even though the original number of fastest running cells exceeds the maximum number that can be displayed, i.e., when the number of pixels along a display axis equals or is fewer than the number of cells desired to be displayed. The use of hierarchical symbols to represent data over multiple subspaces serves many useful data analysis purposes.

A legend can be displayed to label the various hierarchical symbols being displayed. Legend information concerning a given symbol can be scale information, color information, the number corresponding to the results (in numerical form), etc. Also, information pertaining to the symbol or describing the symbol can be included in the legend.

A further embodiment of the present invention involves the fast plot of data using either a one hierarchical axis or two hierarchical axis method.

Consider the case where one or more of the independent variables have a very large number of values. These values can be either bins or lattice points. In this case, the data may be re-binned in a fashion somewhat different than has previously been discussed. This type of re-binning involves the creation of two or more variables to represent the original variable. The new variables are called "spawned" variables. This is indicated in Fig. 40 which shows status indicators for the simple case of replacing a variable A, having 16 values (401), by two variables A' (402) and A" (403), each having four values. Of course, we are generally interested in variables with a far greater number of values.

For example, if originally a variable has one million (1,000,000) bins, one can create groups of wider bins, such as one thousand (1000) bins containing one thousand (1000) smaller bins each. In this way, the one variable with one million bins is replaced by two variables each having one thousand (1000) bins. In effect, this creates a hierarchical nesting relationship.

Alternatively, three variables can be created to replace the one million (1,000,000) bins, with each variable containing one hundred (100) bins. This provides an even more coarse look at each level. It is also easier on computer overhead, as fewer symbols have to be displayed at any one time. From the user's perspective, the user can sort through the coarser display of symbols to find locations in the graph where a more detailed visualization is necessary, thus speeding up the exploratory process. With large sets of data it is common to have a majority of the data be "uninteresting" with only a few regions being of interest. Using the fast plot technique, the data can be scanned through successively finer graphs until the user is able to pinpoint the regions containing data of interest.
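The replacement of one variable by spawned variables amounts to rewriting a flat bin index as a tuple of indices, one per spawned variable. The following sketch assumes zero-based indices and equal nesting order (slowest spawned variable first); the function name is hypothetical.

```python
def spawn_index(index, sizes):
    """Split a flat bin index into one index per spawned variable.

    sizes: number of values of each spawned variable, slowest running
           first, e.g. (1000, 1000) or (100, 100, 100).
    Returns a tuple ordered like sizes.
    """
    parts = []
    for size in reversed(sizes):          # peel off the fastest variable first
        index, part = divmod(index, size)
        parts.append(part)
    if index:
        raise ValueError("index lies outside the spawned variables' range")
    return tuple(reversed(parts))

print(spawn_index(13, (4, 4)))                # variable A with 16 values -> (A', A")
print(spawn_index(999_999, (1000, 1000)))     # one million bins, two spawned variables
print(spawn_index(123_456, (100, 100, 100)))  # one million bins, three spawned variables
```

Since 1000 × 1000 and 100 × 100 × 100 both equal 1,000,000, either choice covers every original bin exactly once.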

There is shown in Fig. 31 a flow chart for the draw suppression tool 310. Draw suppression tool 310 allows a hierarchical symbol to be turned on or off. In draw suppression tool 310, a symbol or group of symbols is selected in block 311. The symbol or group of symbols selected is then turned on or off in block 312.

There is also shown in Fig. 31 symbol color tool 313. Symbol color tool 313 allows any hierarchical symbol to be assigned a color. In assigning a symbol a color, the symbol is displayed on screen in that color. Symbol color tool 313 begins with block 314 where a symbol or group of symbols is selected. Processing then goes to block 315 where a color or colors for the selected symbols are selected. Please note that a symbol is not limited to having one color, as multiple information can be represented by a single symbol. In block 316, the symbol or group of symbols is displayed in the selected color.

There is also shown in Fig. 31 a flow chart for manual scaling tool 317. Manual scaling tool 317 allows for setting the scale or scales for displaying hierarchical symbols. In the manual scaling tool 317, a symbol or group of symbols is selected for scaling in block 318. In block 319, an appropriate scale factor or factors is entered. The scale factors can be the end points of the scale. For instance, a y-axis scale can run from 0 to 10 or 0 to 100, etc. In block 320, the symbols are displayed using the new scale.
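The effect of entering scale end points can be sketched as a linear mapping from a result to a symbol height within its cell. This is an assumption-laden illustration: the linear mapping, the clipping of out-of-range results and the function name are not taken from the description.

```python
def to_pixels(value, scale_min, scale_max, cell_height_px):
    """Map a result to a symbol height using the entered scale end points.

    Results outside [scale_min, scale_max] are clipped to the cell
    (an assumed policy).
    """
    if scale_max == scale_min:
        raise ValueError("scale end points must differ")
    fraction = (value - scale_min) / (scale_max - scale_min)
    fraction = min(1.0, max(0.0, fraction))
    return fraction * cell_height_px

# The same result drawn in a 40-pixel cell on a 0 to 10 and a 0 to 100 scale.
print(to_pixels(5, 0, 10, 40), to_pixels(5, 0, 100, 40))
```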

There is shown in Fig. 32 a flow chart for the grid-line on/off tool 321. Grid-line on/off tool 321 merely turns on or off the grid lines which are displayed on screen. Grid lines can be used to show cell boundaries.

There is also shown in Fig. 32 a hierarchical cell status indicator on/off tool 322. Hierarchical cell status indicators can be turned on or off, depending on whether the user wants this additional information displayed on screen at any particular time.

Black and white/color toggle tool 323 allows the user to switch between black and white representation and color representation of the symbols and other information displayed on screen. Also shown in Fig. 32 is mid-line tool 324. Mid-line tool 324 turns a symbol bisector on or off.

Symbol bisectors are used to show additional information about the symbols on the display screen. For instance, if one is plotting a rectangle with the lower boundary being one standard deviation of the mean below the mean and the upper boundary being one standard deviation of the mean above the mean, the mid-line would represent the mean. Bisectors could be used for other applications where it is beneficial to show a bisection of the symbol. This could take place at the middle or at another location in the symbol. Also shown in Fig. 32 is outline tool 325. Outline tool 325 turns hierarchical symbol outlines on or off. Hierarchical symbol outlines are used to emphasize the uniqueness of the symbol by graphically setting it apart from other symbols.
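The rectangle-with-bisector example above can be computed as follows. This sketch assumes the sample standard deviation (n - 1 in the denominator) and uses made-up values; the function name is hypothetical.

```python
import math
import statistics

def mean_sem_symbol(values):
    """Return (lower boundary, mid-line, upper boundary) for a rectangle
    spanning one standard deviation of the mean either side of the mean.

    Uses the sample standard deviation (an assumption) and therefore
    needs at least two values.
    """
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean - sem, mean, mean + sem

lower, mid, upper = mean_sem_symbol([1, 2, 3, 4])
print(lower, mid, upper)
```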

Also shown in Fig. 32 is rendering direction tool 326. Rendering direction tool 326 allows hierarchical symbols to be drawn in any order. In block 327, an order is selected for graphically displaying the symbols. This is accomplished by selecting symbols one by one on screen, by typing in an order for the symbols, or by selecting all symbols in a given independent variable for display ahead of other independent variables. In block 328, the symbols are displayed in the order selected in block 327.

There is shown in Fig. 33 a flow chart of global symbol toggle tool 331. Global symbol toggle tool 331 turns a global hierarchical symbol on or off. Also shown in Fig. 33 is independent to dependent variable swapping tool 332, which allows the user to change an independent variable to dependent variable status. This tool begins at block 333, where one of the current independent variables is selected to become a dependent variable. In block 334, the new symbols are displayed using the new dependent variable.

Also shown in Fig. 33 is hard copy tool 335. Hard copy tool 335 allows screen representations to be output to a hard copy device or desk top publishing program. An example of a hard copy device would be a printer or plotter.

Also shown in Fig. 33 is interrogate tool 336. Interrogate tool 336 allows numbers represented by hierarchical symbols and/or cells to be displayed, printed, or stored in another location. This tool begins at block 337 where a symbol or group of symbols to be interrogated is selected. Processing then continues to block 338 where the information from the selected symbol or symbols is printed or displayed.

There is shown in Fig. 34 cell/symbol suppression tool 341. Cell/symbol suppression tool 341 allows hierarchical symbols to be displayed according to some criteria. For instance, symbols which are too small to be displayed on the current display could be suppressed by turning them off. Cell/symbol suppression tool 341 begins in block 342 where a symbol or symbols which cannot reasonably be displayed given the current status of the display are selected. Next, in block 343, the graphic display is updated without the selected symbols.
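One possible criterion for suppression is a minimum on-screen size. The sketch below is hypothetical throughout: the one-pixel threshold, the dictionary of symbol sizes and the function name are assumptions chosen to illustrate blocks 342 and 343.

```python
def suppress_small(symbols, min_px=1.0):
    """Split symbols into those to draw and those to suppress.

    symbols: mapping of a cell identifier to (width_px, height_px).
    A symbol is suppressed when either dimension is below min_px
    (an assumed criterion).
    """
    drawn, suppressed = {}, {}
    for cell, (width, height) in symbols.items():
        target = drawn if width >= min_px and height >= min_px else suppressed
        target[cell] = (width, height)
    return drawn, suppressed

# Made-up symbol sizes for three cells.
drawn, suppressed = suppress_small({"a": (4.0, 12.0), "b": (0.5, 12.0), "c": (4.0, 0.2)})
print(sorted(drawn), sorted(suppressed))
```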

Also shown in Fig. 34 is drawn attribute tool 344. Drawn attribute tool 344 allows line styles, thickness, area cover, fills and other attributes of the display to be set. In drawn attribute tool 344, a graphic attribute to be modified is selected in block 345. The selected attribute is changed in block 346. Finally, the graph is displayed using the new attributes in block 347.

Also shown in Fig. 34 is symbol representation tool 348. Symbol representation tool 348 allows hierarchical symbols to be redefined at any time. Symbol representation tool 348 begins at block 349, where a symbol or a group of symbols for re-representation is selected. In block 350, a new representation is assigned to the selected symbol or symbols. In block 351, the modified symbol or symbols are displayed.

There is shown in Fig. 35 cell transformation tool 352. Cell transformation tool 352 allows hierarchical cells to be reshaped, rearranged or sized at any time. In block 353, a cell or group of cells for transformation is selected. In block 354, the selected cells are transformed. In block 355, the symbols for the new cells are displayed.

Also shown in Fig. 35 is symbol transformation tool 356. Symbol transformation tool 356 is similar to the cell transformation tool 352 except that symbols can be reshaped, rearranged, or sized at any time. In block 357, a symbol or group of symbols for a transformation is selected. In block 358, the symbol or group of symbols is transformed. In block 359, the transformed symbol or group of symbols is displayed.

The technique of spawning two or more variables to replace one variable or invoking another instantiation of an "MGTs" graph corresponding to a different variable ordering or variable binning of a zoomed subset of variables, etc. (that is, any MGTs that can result from the application of any of the tools) can produce a large number of important MGTs macros. Consider, for example, the simple case of a 1 hierarchical axis (HA) technique which one wishes to map to a 2 HA technique by spawning the last variable into, say, two variables, the slower to be plotted as a single variable on the vertical of a 2 HA MGTs. Fig. 41A shows the original 1 HA schematically with 16 values of the slowest running variable (here only the largest 16 cells are shown - not data). Only a small fraction of the computer monitor is used and several faster running variables may be suppressed due to pixel limitations, while increasing the vertical extent of each cell may not be called for. Fig. 41B shows the result of spawning this slowest running variable into two variables, one vertical and one horizontal, each with 4 values, to create a 2 HA plot while maintaining the HS rendering used in the original 1 HA plot.

Returning to the most general definition of the macro tool, it can best be understood by noting that a language, such as a macro language, can be used that will allow the user to create any number of MGTs graphs from any number of data sets (limited only by system memory), to position and size these graphs, and to address any graph and duplicate it and/or apply any of the tools to it once or successively.

The macro tool (ultimately invoked via the scripting of the macro language) can create "dead" clones, that is, images of an MGTs graph which are not themselves active MGTs graphs; i.e., they can be moved, sized and killed, but no other tools (e.g. animate, decimate, subspace zoom, etc.) which require an active affiliated MGTs program running can be used on the clones. It should be noted that since one of the MGTs tools kills an MGTs graph and another kills an MGTs clone, the macro language can be used to present a multi-variable data visualization and analysis "slide show"; hence, it is referred to as a macro scripting language.

Earlier, the expander tool was described as showing how a single dependent variable changed when moving away from a selected point in the multidimensional independent variable space in the direction of each of the independent variables. This tool is patterned after the 1HA method in that the dependent variable is plotted in the vertical direction while all the possible displacements in the multiple independent variables are along the horizontal. However, these displacements are not hierarchical. That is, the horizontal displacement along any independent variable direction (indicated by the color of a line connecting the new point to adjacent points) of one unit is displayed as one graphical unit on the computer monitor.

The expander tool has been generalized to show all possible paths from a point where all independent variables take on their minimum values (A_min, B_min, C_min, ...) to a point where they all have their maximum values (A_max, B_max, C_max, ...).

The procedure automatically generates all possible paths between any two points on the lattice.

Hence, the new tool is called the "all paths" tool. Fig. 42 shows a simple example of this type of "all paths" tool for the case of three variables each with two values. Fig. 42A shows a one hierarchical axis min/max MGTs graph while Fig. 42B shows the corresponding "all paths" display. A second type of "all paths" tool is shown in Fig. 43 for the simple case of four variables each with two values. Here displacements along the A, B, C and D variable directions are represented by vectors in the horizontal/vertical plane of a monitor. A, B, C and D are represented as having the same horizontal component but differing vertical components. The value of the dependent variable is indicated by the area or radius of circles at the ends of these vectors as shown.

Another type of "all paths" tool is shown in Fig. 44 for the simple case of four variables each with two values. The dependent variable value is indicated by the area or the radius of the circles at the ends of the vectors, as in the last example. This type of "all paths" tool has two vectors in the horizontal direction and two vectors in the vertical direction, with all vectors being of the same length.

The three types of "all paths" tools described above (in Figs. 42, 43 and 44) are not limited to a small number of variables or variable values, nor are they limited to these particular representations of displacements and dependent variable values.
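The path generation underlying the "all paths" tools can be sketched as follows. This is only an illustration: it assumes that a path consists of steps that each increase exactly one independent variable by one unit, and the function name `all_paths` is hypothetical.

```python
from itertools import permutations


def all_paths(start, end):
    """Return every path of unit steps leading from start to end.

    Assumption: each step raises exactly one coordinate by 1, so a path
    is fully described by the order in which the axes are stepped.
    """
    steps = []
    for axis, (lo, hi) in enumerate(zip(start, end)):
        if hi < lo:
            raise ValueError("end must not be below start on any axis")
        steps.extend([axis] * (hi - lo))

    paths = []
    # set() removes duplicate orderings when an axis needs several steps.
    for order in sorted(set(permutations(steps))):
        point = list(start)
        path = [tuple(point)]
        for axis in order:
            point[axis] += 1
            path.append(tuple(point))
        paths.append(path)
    return paths


# Three variables with two values each, as in the Fig. 42 example.
paths = all_paths((1, 1, 1), (2, 2, 2))
for path in paths:
    print(path)
```

For three two-valued variables the sketch yields one path per ordering of the three variables; a display would then attach a symbol for the dependent variable to each lattice point visited.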

In Fig. 44, if two concentric circles are present, they represent two points in the four-dimensional space with, in general, two different values of the dependent variable; e.g., the point α can be (1,1,2,1) or (2,1,1,1), while β can be (1,2,1,1) or (1,1,1,2), etc. Similarly, four concentric circles mean that four points are represented. For example, ε can be (1,2,2,1) or (2,1,2,1) or (1,1,2,2) or (2,2,1,1), etc.

The "all paths" graphs, that is, the graphs that result when one applies an "all paths" tool, are to be considered MGTs graphs even though they are not hierarchical in nature. As with any other MGTs graph, many other tools may be applied as appropriate to an "all paths" graph, for example the zoom tools and the decimate, display, permute, animate and clone tools, etc. Similarly, "all paths" graphs may be created as appropriate from any MGTs graph, whether it involves only conventional variables or also includes meta-variables.

Because it may be difficult or unclear how to categorize variables as "dependent" or "independent" and, moreover, it may be difficult to know, in general, which variables are "relevant" and which are "irrelevant", it is useful to have a variety of tools that address these issues. Shown in Fig. 45 is a schematic representation of one example of what is called the overview array tool. The data for Fig. 45 consist of four variables A, B, C, and D, which are continuous in nature. For simplicity, each variable is binned into a small number of bins. Five bins are shown in Fig. 45. The mean, for instance, of each variable when a second variable is constrained to be in one of its 5 bins is then found. Then, a plot is made of the mean of the first variable plus or minus the standard deviation of the mean versus bin number for the second variable for all 4 x 4 = 16 possible cases, i.e. 4 choices for the variable for which the mean is being found, namely A, B, C and D, and 4 choices for the binned variable to be plotted horizontally. In addition, it is useful to also plot N, which is the total number of data points in each bin, in a new row located below the <D> row.
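The quantities plotted by the overview array tool can be sketched as below. The sketch assumes equal-width bins spanning each variable's range and interprets "standard deviation of the mean" as the sample standard deviation divided by the square root of the bin count; the names `bin_index` and `overview_array` and the sample data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def bin_index(x, lo, hi, n_bins):
    """Equal-width binning (an assumption; the text does not fix the scheme)."""
    if hi == lo:
        return 0
    return min(max(int((x - lo) / (hi - lo) * n_bins), 0), n_bins - 1)


def overview_array(data, n_bins=5):
    """Map (target, binned) -> list of (mean, std. dev. of mean, N) per bin."""
    result = {}
    for binned, xs in data.items():
        lo, hi = min(xs), max(xs)
        bins = [bin_index(x, lo, hi, n_bins) for x in xs]
        for target, ys in data.items():
            rows = []
            for b in range(n_bins):
                vals = [y for y, k in zip(ys, bins) if k == b]
                n = len(vals)
                m = mean(vals) if n else None
                sem = stdev(vals) / sqrt(n) if n > 1 else None
                rows.append((m, sem, n))
            result[(target, binned)] = rows
    return result


# Made-up sample data for four variables A, B, C and D.
data = {
    "A": [float(i) for i in range(10)],
    "B": [2.0 * i for i in range(10)],
    "C": [float(9 - i) for i in range(10)],
    "D": [float(i % 2) for i in range(10)],
}
array = overview_array(data)
print(array[("B", "A")])
```

The result holds one entry for each of the 4 x 4 combinations, and the count N per bin is carried along so that it can be plotted in its own row.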

The plot shown in Fig. 45 involves plotting the mean of one variable subject to constraining the same or another variable to be in a certain bin. The overview array tool can be generalized to plot not only these means but also means that correspond to constraining not one but two variables, i.e. each of the two variables is constrained to be in one bin of its respective set of bins. For example, the mean of A subject to B being in bin B₃ and C being in bin C₂ can be found. Similarly, 3 or even all 4 variables can be constrained to be in a particular set of their respective bins, e.g. A in bin A₂, B in bin B₁, C in bin C₅ and D in bin D₃.

Fig. 46 shows one possible layout of the array when two variables are constrained to specific bin sets, but for a lesser number of bins and variables, namely, just 3 bins for each of three variables.

It is useful to define yet another abstract type of independent variable, called the "permutation variables". Consider, for example, the case of having three variables A, B and C. A set of permutation variables can be defined that allows looping over all possible permutations of A, B, and C. In the case of three variables, the set of permutation variables consists of 2 variables. For example, a permutation variable can be defined that determines which of the three variables is slowest running. It has 3 values, namely A, B, and C. Next, a variable is defined that determines which variable is second slowest running. It has 2 values, which depend on the value of the slowest running permutation variable. If, for instance, the slowest is A, then the two values for the second slowest variable are B and C, etc. The permutation variables are only two in number in the case of 3 variables A, B and C because there is only one possible choice for the third slowest variable (or the fastest in the case of 3 variables) once the slowest and second slowest have been selected. For example, if A is slowest and B is second slowest, C must be the fastest. Clearly, the permutation independent variables can be defined for any number of variables, with their number of values being N, N-1, N-2, ... 1 for the slowest permutation variable to the fastest respectively, or vice versa.

The permutation variables differ from other variables in that they cannot be permuted amongst themselves. That is, the fastest running permutation variable, if present, must be the fastest of the set of permutation variables; the second fastest running permutation variable, if present, must be slower than the fastest but faster than the remaining permutation variables, etc. However, not all permutation variables need be present. For instance, if only the slowest running permutation variable is present, the default ordering for the second slowest, third slowest, etc. is that they are in cyclic order for each selection of the slowest. For example, for the case of three variables A, B, and C, if A is slowest and no second slowest permutation variable is present, then the second slowest is automatically set to be B and the third slowest is set to be C.

Another approach to the array presentation of, for example, the means of the variables (or sums etc.) is to form a collection of MGTs graphs as indicated schematically in Fig. 47 for the case of three variables A, B, and C. Here, each of the 9 rectangles represents a 1HA MGTs graph for the mean of A (first row of MGTs graphs) or of B (second row) or of C (third row), while each column corresponds to a different selection for the slowest running variable, with the second and third slowest being in cyclic order. Hence, this composite array of 1HA MGTs graphs can be thought of as consisting of a single 2HA MGTs graph where a DVS variable is on the vertical and the slowest running horizontal axis variable is the slowest running permutation variable (the second and third slowest running permutation variables simply being in cyclic order), even though the rendering scheme used may be one that is normally associated with the 1HA method.

Since some array tools create a new MGTs graph, many other tools may be applied as appropriate to an MGTs graph that has been created by an array tool, including the zoom tools and the decimate, display, permute, animate and clone tools, etc. Similarly, "array" type MGTs graphs may be created as appropriate from any MGTs graph, whether it involves only conventional variables or also includes meta-variables.
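The permutation variables and their default cyclic ordering can be sketched in a few lines. This is a minimal illustration that handles only the slowest and second slowest permutation variables; the function name `ordering` is hypothetical.

```python
def ordering(variables, slowest, second=None):
    """Return the variable order from slowest to fastest running.

    When the second slowest permutation variable is absent, the remaining
    variables follow the slowest in cyclic order, as in the text's example.
    """
    i = variables.index(slowest)
    cyclic = variables[i + 1:] + variables[:i]
    if second is None:
        return [slowest] + cyclic
    if second not in cyclic:
        raise ValueError("second slowest must be one of the remaining variables")
    return [slowest, second] + [v for v in cyclic if v != second]


variables = ["A", "B", "C"]

# Only the slowest permutation variable present: cyclic default ordering.
print(ordering(variables, "A"))       # ['A', 'B', 'C']
print(ordering(variables, "B"))       # ['B', 'C', 'A']

# Looping over both permutation variables visits all 3 x 2 = 6 permutations.
every = [ordering(variables, s, t) for s in variables for t in variables if t != s]
print(len(every))                     # 6
```

The second permutation variable has only the values left over by the first, which is why the counts run N, N-1, and so on.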

The arrays shown in Figs. 45-47 can be generalized to show not only the mean of each variable along the vertical but also the sum and/or the standard deviation etc. This can be done utilizing the existing techniques if the mean, the sum, the standard deviation, etc. are referred to as operations, and an "operation" selection variable is created that can then take on two or more values defined by the user.

It should be noted that when the DVS variable is the fastest running variable on the vertical, one can view figures such as Fig. 47 as a 1HA MGTs graph with three dependent variables represented as a vertical column of three rectangular cells.

Similarly, if the DVS variable were the fastest running horizontal variable, the resulting graph would be equivalent to a three dependent variable 1HA MGTs graph with the multiple dependent variables represented as a horizontal row of three rectangular cells. This, of course, is also true for any operation other than the mean, and similar statements hold for non-rectangular cells.

All of the array-like plots described can be made using the circle or rectangle renderings normally associated with 2HA methods.

The non-conventional variables described above, namely the dependent variable selection (DVS) variable, the permutation variable and the operation variable, can be referred to as meta-variables. These meta-variables point to, permute or select an operation to be performed on conventional variables. More generally, a meta-variable can be thought of as a variable characterizing any operation on an original data set or sets which arranges or re-arranges, associates or re-associates, combines or re-combines, classifies or re-classifies the data or any or all subsets of the data, or in any other way organizes or re-organizes the data, or which selects any subset or subsets of the data, or which defines new data based on the original data by the application of standard mathematical operations, or which results from applying any, all or any subset of said operations once or repeatedly. Thus, the act of performing an operation on a conventional variable can be thought of as generating a new conventional variable which generally will belong to or be defined over a smaller subspace than the original conventional variable. For example, performing the sum of a dependent variable V over all possible lines of bins along the A variable direction in an A,B,C,D space produces a new variable, namely Σ_A V(B,C,D), from V(A,B,C,D). This new variable belongs to, is defined over and depends on the variables in a smaller space, namely the B,C,D subspace. Variables that are derived from one or more operations over one or more subspaces are called generated variables.
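The summation example can be sketched as follows, assuming the dependent variable is stored as a mapping from lattice index tuples (A, B, C, D) to values. The function name `sum_over` and the sample values are hypothetical.

```python
from itertools import product


def sum_over(values, axis):
    """Sum a dependent variable along one independent variable direction.

    values maps index tuples to numbers; the result is defined over the
    smaller subspace obtained by dropping `axis`.
    """
    out = {}
    for key, v in values.items():
        reduced = key[:axis] + key[axis + 1:]
        out[reduced] = out.get(reduced, 0) + v
    return out


# Made-up V(A, B, C, D) on a 2 x 2 x 2 x 2 lattice.
V = {(a, b, c, d): a + 10 * b + 100 * c + 1000 * d
     for a, b, c, d in product(range(2), repeat=4)}

# Generated variable defined over the B, C, D subspace.
generated = sum_over(V, axis=0)
print(len(V), len(generated))
```

The generated variable has one value per point of the B,C,D subspace and can itself be treated as a dependent variable in a further graph.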

Generated variables, along with the conventional variables, may be selected to be included in the DVS variable and/or may be entered as independent variables to be used with any of the one hierarchical axis or two hierarchical axis methods or their generalizations when such an inclusion leads to a logically consistent and mathematically well-defined system.

Another possible meta-variable is the rebinning variable. The rebinning tool, as described above, changes the width of the bins which are used for each of the conventional variables. A meta-variable can be defined which can be viewed as splitting one or more conventional variables (and where appropriate, non-conventional variables) into the number of variables set by the value of a rebinning variable. For example, consider a variable that has been assigned N_R bins, where N_R is an integer. Since N_R can always be written as a unique product of prime numbers, it has a unique representation

$$N_R = P_{N_R 1} \, P_{N_R 2} \cdots P_{N_R n} = \prod_{i=1}^{n} P_{N_R i}$$

where P_NRi = the ith prime number (with the prime numbers in ascending order) in the product of primes equal to N_R; and

n = the number of prime factors in the product of primes equal to N_R.

Thus, the new rebinning variable has the values 1, 2, 3, ..., n. The first value, 1, means do not rebin. The second value, 2, means decompose the N_R variable so that it is now represented as two variables, the first of the two variables having N_R'2 bins and the other N_R"2 bins, where N_R = N_R'2 x N_R"2.

Accordingly, it is understood that the present invention has been described by way of illustration and not limitation.

That is, both N_R'2 and N_R"2 are products of P_NRi, where the set of P_NRi belonging to the product equal to N_R'2 and the set of P_NRi belonging to N_R"2 are distinct with no equal members, and the collection of the two sets contains all P_NRi in the N_R prime product representation. The number of ways of forming these two sets is

Hence, a user must either specify a new meta-variable which specifies which subcases will be displayed, or one invents a simple default convention for the case where the new meta-variable is absent. In the latter case, the convention will be to divide the set of P_NRi (in sequence) into two equal sets if n is even, or into a first set with

and a second set with

if n is odd, with similar rules for the cases where the variable is to be split into three or more variables, i.e., rebinning variable value 3 or 4, etc. The advantage of inventing the rebinning meta-variable is that one can then, in effect, loop over all possibilities for bin sizes which are consistent with the original numbers of bins chosen for each variable and, hence, can, in principle, also loop over the generalized animations, etc. The rebinning task is made far more straightforward if each of the variables is initially binned so that the number of bins is a power of 2 or 3 or 4, etc. For example, if a variable is binned initially to 16 bins, then

N_R = 16 = 2⁴ = 2*2*2*2.
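The default split of a binned variable into two variables can be sketched as below. The sketch assumes the prime factors are taken in ascending sequence and that, when n is odd, the first set receives the extra factor; the text does not preserve that rule, so it is only a placeholder convention. The function names `prime_factors` and `default_split` are hypothetical.

```python
from math import prod


def prime_factors(n):
    """Return the prime factors of n in ascending order, with repetition."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def default_split(n_bins):
    """Split n_bins into two bin counts whose product is n_bins.

    Assumption: the factor sequence is cut in the middle, and for an odd
    number of factors the first set gets one more factor than the second.
    """
    primes = prime_factors(n_bins)
    half = (len(primes) + 1) // 2
    return prod(primes[:half]), prod(primes[half:])


print(prime_factors(16))   # [2, 2, 2, 2]
print(default_split(16))   # (4, 4)
print(default_split(12))   # (4, 3)
```

With 16 initial bins the split gives two variables of 4 bins each, which is why binning to a power of a small number keeps the rebinning cases simple.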

There is shown in Fig. 30 a flowchart of an embodiment of the invention. In block 501, variables are chosen, as a subset of a larger collection of variables or to define a function, etc., to be the independent variables, driving the hierarchy, and the dependent variable or variables, to determine the results for visualizing.

In block 502, the independent variables are ranked into a predetermined hierarchy, from the fastest running independent variable to the slowest running independent variable, the slowest running variable defining the most complex subspaces and the fastest running variable the simplest one-dimensional subspaces.

In block 503, the hierarchical cells are created and assigned, in a nested fashion, to all of the subspaces generated by the ranked independent variables. In addition, a result or results are generated and assigned to each created cell. These results are derived from the dependent variable values, either from data or function, and in general correspond to the independent variable values associated with the cell to which they are assigned.

In block 504, a symbol type is chosen to represent the result or results in each cell. In block 505, a hierarchical symbol is constructed for each cell. Hierarchical symbols can be constructed from the "independent" and/or "dependent" variables or from fractions, or can be whole MGTs graphs. The term hierarchical symbol defines a symbol which can be constructed as a composite of other similar or related symbols (for example, a rectangle which "contains" other rectangles).

In block 506, a hierarchical symbol is assigned to each cell. In block 507, it is determined whether or not a symbol will be displayed and in which order.

In block 508, the appropriate symbols are drawn in the specified order.
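The ranking and nested cell construction of blocks 502 and 503 can be sketched for a single hierarchical axis. The sketch assumes each lattice point is an index tuple ordered from fastest to slowest running variable and that a cell's result is an aggregate (here a sum) of the dependent values it contains; all function names and the sample values are hypothetical.

```python
from itertools import product


def cell_positions(sizes_fast_to_slow):
    """Assign every lattice point a position along one hierarchical axis."""
    strides, stride = [], 1
    for size in sizes_fast_to_slow:
        strides.append(stride)   # the fastest variable advances one unit per value
        stride *= size           # slower variables jump over whole nested cells
    return {
        point: sum(i * s for i, s in zip(point, strides))
        for point in product(*(range(n) for n in sizes_fast_to_slow))
    }


def cell_extents(positions, level):
    """Extent (first, last position) of each cell enclosing `level` fastest variables."""
    groups = {}
    for point, pos in positions.items():
        groups.setdefault(point[level:], []).append(pos)
    return {key: (min(p), max(p)) for key, p in groups.items()}


def cell_results(values, level, operation=sum):
    """Result assigned to each cell, derived from the dependent values inside it."""
    groups = {}
    for point, v in values.items():
        groups.setdefault(point[level:], []).append(v)
    return {key: operation(vs) for key, vs in groups.items()}


sizes = (2, 3, 2)                       # fastest, next fastest, slowest
positions = cell_positions(sizes)
values = {p: p[0] + p[1] + p[2] for p in positions}   # made-up dependent variable
print(cell_extents(positions, 1))
print(cell_results(values, 1))
```

Cells at a higher level contain the cells of the levels below them, so a symbol drawn for a slower running cell can summarize everything nested inside it.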

The terms "variables" and "data" in the context of the "dependent" variable or variables that determine the detailed nature of the hierarchical symbols in any given embodiment of the invention and in the context of the "independent" variables that are related to the hierarchical cells and axes of any given embodiment of the invention are not confined to mean numeric data, whether empirically derived or not, but also include all manner of information, whether numeric or text or otherwise, that can be mapped to or associated with numeric data either directly or by reference, including but not limited to categorical information, bodies of information and their relationships, systems and all manner of mathematical entities and constructs including functions, functionals, sequences, sets, groups, rings, algebras, trees, graphs, manifolds and systems of logic.

The terms "dependent" variable(s) and "independent" variable(s) are used merely as convenient ways to distinguish sets of numbers that are used by the various embodiments of the invention to determine attributes of hierarchical symbols and hierarchical cells/axes respectively.

Those skilled in the art will immediately recognize the utility of the present invention in the areas of graphing and data/function analysis. While preferred embodiments have been described, various modifications and substitutions may be made without departing from the spirit and scope of the invention.

Claims

What is Claimed:

1. A system implemented on a digital computer for displaying a function in two dimensions defined by an X-axis and a Y-axis, wherein said function is comprised of a plurality of independent variables and at least one dependent variable, each independent variable having at least one associated value, comprising:

(a) means for reading the values associated with the independent variables;

(b) means for selecting which independent variable is the fastest running variable and ranking it as the fastest running variable;

(c) means for selecting which independent variable is the next fastest running variable and ranking it as the next fastest running variable;

(d) repeating step (c) until all independent variables are ranked with the last ranked independent variable being the slowest running variable;

(e) means for constructing a structure of cells, said means comprising:

1. means for defining a fastest running cell by assigning a cell to each value of the fastest running variable for the first values of all remaining independent variables;

2. means for arranging the fastest running cells according to a first predetermined grouping;

3. means for assigning a next fastest running cell which contains the first predetermined grouping;

4. means for assigning a new cell to each value of the fastest running variable for the second value of the next fastest running variable and the first value of all remaining independent variables;

5. means for arranging a second new cell which contains the second predetermined grouping, said second new cell corresponding to the second value of the next fastest running variable and a first value for all remaining independent variables;

6. means for repeating elements (e)(4) through (e)(5) for each value of the next fastest running variable, wherein said second value of the next fastest running variable is replaced by each remaining value of the next fastest running variable, and the second new cell is replaced by another new cell and the second predetermined grouping is replaced by another predetermined grouping, sequentially, until completed for the last value of the next fastest running variable;

7. means for repeating elements (e)(1) through (e)(6) for each value of any remaining independent variable, wherein said first value of all remaining independent variables is replaced by each value of the next ranked independent variable, sequentially, and the first value of all remaining independent variables until completed for a last value of the slowest running variable;

(f) means for determining a result for each cell assigned in elements (e)(1) through (e)(7);

(g) means for displaying a symbol for at least one of said cells, said symbol relating to said result for said cell.

2. The system of claim 1 or 21 wherein a symbol is displayed for a selection of said cells.

3. The system of claim 1 or 21 wherein said symbol is displayed in color.

4. The system of claim 3 wherein each color represents a different independent variable.

5. The system of claim 2 further comprising, means for selecting which of said symbols for said selection of cells is displayed first.

6. The system of claim 5 further comprising means for selecting an order in which any remaining symbols are displayed.

7. The system of claim 1 or 21 wherein there is more than one new independent variable, each of said more than one independent variables corresponding to a set of dependent variables and each more than one independent variable having an associated new value relating to a property of said corresponding dependent variable.

8. The system of claim 1 or 21 wherein said symbol displayed for at least one of said cells represents a summation of the results determined for all of the cells contained in said at least one of said cells.

9. The system of claim 1 or 21 wherein said symbol displayed for at least one of said cells represents a mean of the results determined for all of the cells contained in said at least one of said cells.

10. The system of claim 1 or 21 wherein said symbol represents a maximum of the results determined for all of the cells contained in at least one of said cells.

11. The system of claim 1 or 21 wherein said symbol represents a minimum of the results determined for all of the cells contained in at least one of said cells.

12. The system of claim 1 or 21 further comprising: (a) means for selecting a cell; and (b) means for displaying symbols for only said selected cell.

13. The system of claim 1 or 21 further comprising: (a) means for selecting a group of cells; and (b) means for displaying symbols for only said selected group of cells.

14. The system of claim 1 or 21 further comprising: (a) means for selecting at least one independent variable value for an independent variable; (b) means for redefining said independent variable to have only said selected value; (c) means for displaying symbols relating to said selected value.

15. The system of claim 14 wherein more than one independent variable value is selected.

16. The system of claim 1 or 21 further comprising means for rearranging the rankings of the independent variables to provide a new ranking of independent variables from fastest to slowest.

17. The system of claim 1 or 21 further comprising means for regrouping the cells according to a new predetermined grouping.

18. The system of claim 1 or 21 wherein said cells have a predetermined shape and said shape is modifiable.

19. The system of claim 1 or 21 wherein a new result is determined for at least one cell.

20. The system of claim 1 or 21 wherein a new symbol is displayed for at least one of said cells, said new symbol relating to said result for said cell.

21. A system implemented on a digital computer for displaying data in two dimensions defined by an X-axis and a Y-axis, wherein said data is comprised of a plurality of independent variables and at least one dependent variable, each independent variable having at least one associated value, comprising:

(a) means for reading the values associated with the independent variables;

(b) means for selecting which independent variable is the fastest running variable and ranking it as the fastest running variable;

(c) means for selecting which independent variable is the next fastest running variable and ranking it as the next fastest running variable;

(d) repeating step (c) until all independent variables are ranked with the last ranked independent variable being the slowest running variable;

(e) means for constructing a structure of cells, said means comprising:

1. means for defining a fastest running cell by assigning a cell to each value of the fastest running variable for the first value of all remaining independent variables;

2. means for arranging the fastest running cells according to a first predetermined grouping;

3. means for assigning a next fastest running cell which contains the first predetermined grouping;

4. means for assigning a new cell to each value of the fastest running variable for the second value of the next fastest running variable and the first value of all remaining independent variables;

5. means for arranging a second new cell which contains the second predetermined grouping, said second new cell corresponding to the second value of the next fastest running variable and a first value for all remaining independent variables;

6. means for repeating elements (e)(4) through (e)(5) for each value of the next fastest running variable, wherein said second value of the next fastest running variable is replaced by each remaining value of the next fastest running variable, and the second new cell is replaced by another new cell and the second predetermined grouping is replaced by another predetermined grouping, sequentially, until completed for the last value of the next fastest running variable;

7. means for repeating elements (e)(1) through (e)(6) for each value of any remaining independent variable, wherein said first value of all remaining independent variables is replaced by each value of the next ranked independent variable sequentially, and the first value of all remaining independent variables until completed for a last value of the slowest running variable;

(f) means for determining a result for each cell assigned in elements (e)(1) through (e)(7);

(g) means for displaying a symbol for at least one of said cells, said symbol relating to said result for said cell.

22. A system implemented on a digital computer for displaying a function in two dimensions defined by an X-axis and a Y-axis, wherein said function is comprised of a plurality of independent variables and at least one dependent variable, each independent variable having at least one associated value, comprising:

(a) means for establishing the values associated with the independent variables;

(b) means for selecting which independent variable is the fastest running variable and ranking it as the fastest running variable;

(c) means for selecting which independent variable is the next fastest running variable and ranking it as the next fastest running variable;

(d) repeating step (c) until all independent variables are ranked with the last ranked independent variable being the slowest running variable;

(e) means for constructing a structure of cells, wherein each cell of said structure of cells has a ranking and each cell of said structure of cells has a set of values of a set of independent variables associated with said each cell;

(f) means for determining a result for each cell of said structure of cells; and

(g) means for displaying a symbol for at least one of said cells of said structure of cells.

23. A system implemented on a digital computer for displaying a function in two dimensions defined by an X-axis and a Y-axis, wherein said function is comprised of a plurality of independent variables and at least one dependent variable, each independent variable having at least one associated value, comprising:

(a) means for establishing the values associated with the independent variables;

(b) means for ranking said independent variables from fastest to slowest;

(c) means for constructing a structure of cells, wherein each cell of said structure of cells has a ranking from fastest cells through to slowest cells with the faster running cells contained within the slower running cells and each cell of said structure of cells having a set of values of a set of independent variables associated with said each cell;

(d) means for determining a result for each cell of said structure of cells; and

(e) means for displaying a symbol for at least one of said cells of said structure of cells.

24. A system implemented on a digital computer for displaying a function in two dimensions defined by an X-axis and a Y-axis, wherein said function is comprised of a plurality of independent variables and at least one dependent variable, each independent variable having at least one associated value, comprising:

(a) means for defining a rendering of at least one step on an output device, wherein said step represents a unit increase of a value of one of said independent variables;

(b) means for defining a rendering of at least one symbol, said symbol having a size and a location, said symbol representing a result; and

(c) means for displaying said symbol and said step, wherein said step is associated with said symbol.