Talk:Data-flow analysis

From Wikipedia, the free encyclopedia

This should probably be moved to wikibooks, as it is not an encyclopedic article. Thue | talk 15:07, 15 Jul 2004 (UTC)


There should be a definition of "dataflow analysis".

We can say something like : "Data flow analysis is a process for collecting run-time information about data in programs without actually executing them" --Cocothebo

Technically, it's data-flow (with a hyphen) analysis. FAdmMatt

You seem very certain of this. Why is "control flow analysis" always spelled without the hyphen? Certainly, the Dragon book uses the hyphen, but I find that most papers on the subject do not (some use a space, some neither a space nor a hyphen). I'd really like to know the definitive answer. --Mike Van Emmerik 12:09, 14 February 2006 (UTC)

[edit] not enyclopedic

Agreed that this is not encyclopedic, but disagree that it should be moved to wikibooks. If I knew anything about data-flow analysis, I would write it here, but I don't (which is why I am here). Suggested sub-topics:

  • what it is
  • why it is used
  • history/background

the FolDoc entry might be a good place to start. --Lenehey 22:36, July 28, 2005 (UTC)

[edit] Reverse postorder is not the same as preorder!

It's really scary that this article has claimed since its creation in 2004 that reverse postorder is just another name for preorder. So much for the many-eyes philosophy. This is the worst kind of mistake an article can have, because once you get an idea like this into people's heads it tends to be very difficult to get it out. I fixed the problem, but I'm placing an erratum here in the hope of catching the eye of a few of the people who were misled by the article. -- BenRG 15:04, 2 September 2006 (UTC)

Hmm, actually on rereading that paragraph I'm not sure what it's talking about. What does "except when the successor is reached by a back edge" mean? Is it just talking about the fact that you don't revisit nodes you've already visited? Unless there are objections, I think I'll rewrite this paragraph to simply say that reverse postorder is the reverse of postorder. -- BenRG 15:15, 2 September 2006 (UTC)
It probably means that you can't order a flow graph with loops in it based on successor relations only. The back edge is defining a loop in the program. Mauritsmaartendejong 11:03, 16 July 2007 (UTC)
If you rewrite it, it would be nice to have an actual example flowgraph with the different orders written out, and evaluated in terms of number of basic block visits. Mauritsmaartendejong 11:03, 16 July 2007 (UTC)
Actually, as it stands now, this entire paragraph is complete nonsense. It really depends on the actual data flow analysis whether or not the order matters. I know quite a few analyses where order does not matter at all. So I don't know if this paragraph should even be in there. In any case it should be rephrased. Ericbodden 04:02, 11 May 2007 (UTC)
Could you name a few of those analyses where order does not matter ? I've been seeing quite a few wrong orderings leading to quadratic behaviour and compilations taking hours or even days to complete which were brought back to minutes by careful ordering. I would say that this makes the notion at least relevant. Perhaps we should qualify the statement with the cases where it does matter. My intuition says that whenever you propagate information from one basic block to another, you better try to order the blocks according to that propagation direction. In my view, this propagation is the essence of data flow analysis. Mauritsmaartendejong 11:03, 16 July 2007 (UTC)

[edit] Suggestion for new structure

I think data flow analysis is an important topic in compiler optimizations, and would like to get this article in good shape.

I would propose the following topics:

  • Introduction
    • examples
    • applications, e.g. live variables in relation to dead code elimination
  • intra block data flow
    • statements
    • direction of flow
  • inter block data flow
    • basic blocks && control flow graph
    • in and out sets
    • gen and kill sets
    • join operation
    • data flow equations
    • fixed point algorithm.
      • conditions for termination.
      • work-list approach
      • work-list ordering
  • structured program analysis
  • bit vector representations
  • more elaborate applications
    • abstract interpretation
    • value range analysis
  • use of ssa
  • history
  • ...


Comments are welcome. Count me in as a contributor, but I'd hate to do this on my own.

Mauritsmaartendejong 12:18, 16 July 2007 (UTC)