Discriminant analysis visualization for scatterplots
Submitted by Luke Hutchison
It would be very useful if Gnumeric could display not just trend lines on graphs, but also discriminant lines that separate two or more classes of points. For example, suppose you add one series of XY points corresponding to a cluster of items in one class and a second series of XY points corresponding to a cluster of items in a second class, and the two clusters are expected to be linearly separable. It would then be very useful to see the decision boundary between the two classes plotted on the scatterplot.
(1) In the simplest case, linear discriminant analysis could be performed and the linear decision boundary between the two classes could be shown. Being able to see the result of Fisher discriminant analysis, an SVM classifier, or logistic regression (without having to export the data into R or something similar) would be extremely useful for many applications. It would also be an excellent addition to the current trend line support: it would only require implementing the discriminant analysis itself, without much in the way of architectural changes to graph plotting.
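To make (1) concrete, here is a rough sketch of the computation behind a two-class Fisher discriminant line in 2D. It is written in Python only for illustration and is not Gnumeric code; the function names (mean, scatter, fisher_boundary) are made up for this example, and placing the threshold at the midpoint between the two class means is an assumption (it corresponds to equal class priors).

```python
def mean(points):
    """Centroid of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, m):
    """Within-class scatter matrix around centroid m, as (sxx, sxy, syy)."""
    sxx = sum((x - m[0]) ** 2 for x, y in points)
    sxy = sum((x - m[0]) * (y - m[1]) for x, y in points)
    syy = sum((y - m[1]) ** 2 for x, y in points)
    return sxx, sxy, syy


def fisher_boundary(class_a, class_b):
    """Return (wx, wy, c) such that the boundary is wx*x + wy*y = c.

    Points with wx*x + wy*y > c fall on the class_b side.
    Assumption: threshold at the midpoint of the class means.
    """
    ma, mb = mean(class_a), mean(class_b)
    axx, axy, ayy = scatter(class_a, ma)
    bxx, bxy, byy = scatter(class_b, mb)
    # Pooled within-class scatter matrix Sw.
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    # w = Sw^-1 (mb - ma), using the closed-form inverse of a 2x2 matrix.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    # The boundary passes through the midpoint of the two centroids.
    c = wx * (ma[0] + mb[0]) / 2 + wy * (ma[1] + mb[1]) / 2
    return wx, wy, c


# Two small example clusters (made-up data).
series_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
series_b = [(4, 0), (5, 0), (4, 1), (5, 1)]
wx, wy, c = fisher_boundary(series_a, series_b)
```

For the two example clusters, which are unit squares centered at (0.5, 0.5) and (4.5, 0.5), this gives the vertical line x = 2.5. A plot would only need to clip the line wx*x + wy*y = c to the visible axis range, much as is done for a trend line.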
(2) In the more complex case, with more than two classes of points and non-linear boundaries (e.g. with three classes and quadratic decision boundaries), rather than drawing a decision line between the classes, the plot background could be colored differently in the regions of the domain that fall within each class. This would also be extremely useful, but would probably be trickier to implement with Gnumeric's current architecture than (1), and may fall more in the realm of the sort of thing you would be trying to do with R in the first place. So I'll just add (2) as a "blue-skies wishlist item", although (1) is definitely something that a lot of people could use in Gnumeric.
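As a rough illustration of (2), the background coloring could come from classifying every cell of a grid laid over the plot area. The Python sketch below does this with a simple quadratic discriminant: one 2D Gaussian per class, with equal class priors assumed. Again, this is not Gnumeric code, and the names fit_gaussian, qda_score and classify_grid are invented for the example.

```python
import math


def fit_gaussian(points):
    """Fit a 2D Gaussian: returns (mx, my, sxx, sxy, syy, det)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample covariance (needs at least two points per class).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    return mx, my, sxx, sxy, syy, det


def qda_score(model, x, y):
    """Log-density of (x, y) under the class model, up to a constant."""
    mx, my, sxx, sxy, syy, det = model
    dx, dy = x - mx, y - my
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * math.log(det) - 0.5 * maha


def classify_grid(classes, xs, ys):
    """Return grid[row][col] = index of the best-scoring class.

    Rows follow ys, columns follow xs. Each index would map to a
    background color for that cell.
    """
    models = [fit_gaussian(pts) for pts in classes]
    return [
        [max(range(len(models)), key=lambda k: qda_score(models[k], x, y))
         for x in xs]
        for y in ys
    ]


def square(cx, cy):
    """Four made-up points around (cx, cy), used as example data."""
    return [(cx - 1, cy - 1), (cx + 1, cy - 1), (cx - 1, cy + 1), (cx + 1, cy + 1)]


classes = [square(0, 0), square(10, 0), square(0, 10)]
grid = classify_grid(classes, [0, 10], [0, 10])
```

A finer grid gives smoother region edges at the cost of more evaluations. With unequal class covariances the region edges come out as quadratic curves, which is the three-class case described in (2).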