Talk:Heat map

Bioinformatics?

Heatmaps shouldn't be considered a bioinformatics topic. Rather, they are a general data visualization technique. Bioinformatics is merely one application.

Any thoughts on changing it?

I completely agree with him, as heatmaps are used for a wide variety of purposes, for example to illustrate the distribution of stress (physics), heat, and so on. —Preceding unsigned comment added by 88.160.174.38 (talk) 06:33, August 30, 2007 (UTC)

I agree somewhat. There are several things to keep in mind, however. First, there are two different types of graphs commonly referenced under this term. The first is the one pictured in this article and widely used in genomics. It consists of a rectangular array of colored pixels representing a matrix. The rows and columns of this array are often permuted to show blocks of similar values. Wilkinson, The Grammar of Graphics, 2nd ed. (Springer, 2005) discusses the history of this graph. One of the earliest examples of a heatmap used for this purpose is in Sneath, P.H.A. (1957). The application of computers to taxonomy. Journal of General Microbiology, 17, 201–226. A famous reference is in Bertin, J. (1967). Sémiologie Graphique. Paris: Editions Gauthier–Villars. English translation by W.J. Berg as Semiology of Graphics, Madison, WI: University of Wisconsin Press, 1983. The addition of cluster trees to the margins of the graph was introduced by Ling, R.F. (1973). A computer generated aid for cluster analysis. Communications of the ACM, 16, 355–361. The first computer program for making one (in black and white, using overstruck characters) was written by John Hartigan and Bob Ling for the BMDP computer program (2-way clustering).

The first appearance of the graph shown in the figure of this article was in SYSTAT, Version 5 (1987). SYSTAT, because it had high-resolution color output, was able to use a heatmap color scale (the same as in this article's example) and horizontal/vertical cluster trees, which were unavailable in the BMDP teletype output. SYSTAT was widely used in the 1980s and early 1990s by biological researchers, and the SYSTAT design for the graph found its way into several computer programs. The one most widely used in bioinformatics was developed by Michael B. Eisen, Paul T. Spellman, Patrick O. Brown, and David Botstein: Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA, Dec 08, 1998; 95: 14863-14868. (This is one of the most widely downloaded articles from PNAS.)
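
To make the "permuted rows and columns" idea concrete, here is a minimal Python sketch. It is only an illustration: it reorders rows and columns by their sums, which is a crude stand-in for the cluster analysis used in the programs cited above, and the function name `reorder_by_sum` and the sample matrix are made up for this example.

```python
def reorder_by_sum(matrix):
    """Permute rows and columns so that similar values end up adjacent.

    Assumption: sorting by row/column sums is a simplification chosen for
    brevity; the cited programs use cluster analysis to find the ordering.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Indices of rows and columns, ordered from smallest to largest total.
    row_order = sorted(range(n_rows), key=lambda i: sum(matrix[i]))
    col_order = sorted(range(n_cols), key=lambda j: sum(row[j] for row in matrix))
    # Apply both permutations to build the reordered matrix.
    permuted = [[matrix[i][j] for j in col_order] for i in row_order]
    return permuted, row_order, col_order


# Hypothetical data: the large values are scattered in the original order.
data = [
    [9, 1, 8, 2],
    [1, 1, 2, 1],
    [8, 2, 9, 1],
    [2, 1, 1, 1],
]

permuted, row_order, col_order = reorder_by_sum(data)
for row in permuted:
    print(row)
```

After reordering, the large values collect in one corner of the array, which is the block structure the colored display is meant to reveal.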

The other type of graph commonly called a heatmap is actually a treemap. This display does not have rows and columns. Instead, it consists of a recursive partitioning of rectangles governed by a tree-building algorithm (cluster analysis or some other). The cells are colored similarly to the permuted heatmap display. The first published example of this graph is in Johnson, B.S. and Shneiderman, B. (1991). Treemaps: A space-filling approach to the visualization of hierarchical information structures. Proceedings of the IEEE Symposium on Information Visualization, 275–282. Its most popular incarnation is the Wall Street Smartmoney Map (http://www.smartmoney.com/marketmap/mapPage.cfm). Wikipedia already has an entry for this graph, oddly under the name Treemapping. (I don't advocate Heatmapping for this entry).
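
For contrast with the row-and-column array, the recursive partitioning can be sketched in a few lines of Python. This assumes the simple "slice-and-dice" layout (alternating horizontal and vertical splits at each level of the tree); the tree, the names, and the rectangle size are hypothetical, and this is not meant as the exact algorithm of any particular program.

```python
def tree_weight(node):
    """Weight of a node: its own value for a leaf, else the sum over its children."""
    if isinstance(node, dict):
        return sum(tree_weight(child) for child in node.values())
    return node


def slice_and_dice(node, x, y, w, h, horizontal=True, name="root", out=None):
    """Recursively split the rectangle (x, y, w, h) among the children of node.

    The split direction alternates at each level of the tree. Leaves are
    collected as (name, x, y, width, height) tuples.
    """
    if out is None:
        out = []
    if not isinstance(node, dict):
        out.append((name, x, y, w, h))
        return out
    total = tree_weight(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child_name, child in node.items():
        share = tree_weight(child) / total
        if horizontal:
            slice_and_dice(child, x + offset * w, y, w * share, h, False, child_name, out)
        else:
            slice_and_dice(child, x, y + offset * h, w, h * share, True, child_name, out)
        offset += share
    return out


# Hypothetical hierarchy: two groups containing weighted leaves.
tree = {"tech": {"A": 3, "B": 1}, "energy": {"C": 4}}
rects = slice_and_dice(tree, 0.0, 0.0, 8.0, 4.0)
for rect in rects:
    print(rect)
```

Each leaf's area is proportional to its weight, and there are no rows or columns anywhere in the result, which is the structural difference from the permuted array.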

It would be unfortunate if Wikipedia confused the two types of graphs. Although they look similar, they have completely different underlying models, different motivations, and different interpretations. If the permuted array and the treemap are both called heatmaps, then a map of the US with counties colored by average temperature or the Wilkinson Anisotropy Map (http://apod.nasa.gov/apod/ap050925.html) should be called heatmaps as well. There is nothing intrinsically rectangular implied by the term Heatmap.

Therefore, I would advocate a different approach. Define a heatmap as a (usually 2D) area graphic (map) whose regions are colored using a color scale to represent the values of a variable. Mention a very old example from the early 1800s (see ). Then point out the two most popular examples (cluster array and treemap) with a link to Treemapping.
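
As a rough illustration of the proposed definition (regions colored by a color scale that represents a variable), the core step is just a mapping from values to colors. The Python sketch below assumes a linear blue-to-red scale; the function name `value_to_rgb`, the endpoint colors, and the region values are all made up for the example.

```python
def value_to_rgb(value, vmin, vmax, low=(0, 0, 255), high=(255, 0, 0)):
    """Map a value to an RGB color by linear interpolation between two colors.

    Assumption: a two-color linear scale (blue for low, red for high);
    real heatmap scales are often multi-stop or nonlinear.
    """
    if vmax == vmin:
        t = 0.5  # all values equal: use the middle of the scale
    else:
        t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    return tuple(round(a + t * (b - a)) for a, b in zip(low, high))


# Hypothetical regions of a map, each with one value of the variable.
regions = {"north": 2.0, "south": 10.0, "east": 6.0}
vmin, vmax = min(regions.values()), max(regions.values())
colors = {name: value_to_rgb(v, vmin, vmax) for name, v in regions.items()}
print(colors)
```

Nothing in this mapping depends on the shape of the regions, so the same step applies to array cells, treemap rectangles, or counties on a geographic map.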

Then a short History section. I would begin with a reference to the Wiki Thematic Map article, which describes the type of geographic maps that motivated the original invention of heatmaps - color used to represent a (continuous) variable. Then give credit to Bertin, Ling and the others who invented the cluster display, to Shneiderman and Johnson, who invented the treemap, and to the others who developed the software (BMDP, SYSTAT, and Eisen).

All this could be done in an article not much longer than this one. Since I am an expert in this field, I have not changed anything in the entry. Good luck! And I would be especially interested if anyone found references earlier than the ones I cited. 67.173.98.211 (talk) 18:06, 24 November 2007 (UTC)