Talk:Optimization (computer science)

From Wikipedia, the free encyclopedia

I know you may say load balancing is a performance tuning not optimization but one likely says "optimize the whole system by balancing load" so. -- Taku


About this addition

Usually, the optimizer does not modify actual code but produce code in the same time when compilers produces target code automatically. In other words, redundancy or any sort of inefficient code in source program remain unchanged with optimizers. To get an efficient code without optimization, one needs to optimize code manually. However, since most today's compilers have at least naive optimzers, manual optimization that can be done by optimizer is usually unnecessary.

I know this completely sounds awfully redundant for computer programmers. But what I found is that often people who have no knowledge about compliers assume optimizers actually covert code written by human to better, efficient code. And of course, we know what's not actually happening. At18, sorry but I think your revision misses this point. I will try once again. See and fix if needed. -- Taku 22:44 May 14, 2003 (UTC)

Hi, I just found your paragraph a bit confusing, and replaced it with a shorter version. You can of course try to expand it a bit to state your point better. At18 08:13 May 15, 2003 (UTC)

Contents

[edit] Too large

I think this article tries to discuss too many distinct topics at once, and should be split into different articles discussing compiler optimizations, manual optimization by humans, and performance tuning of large systems with things like load balancing. They can cross-link all they like, but this article isn't focused enough, in my opinion. Derrick Coetzee 15:39, 15 Nov 2003 (UTC)

Yes, I agree. There used to be a few optimization-related articles because they had overlapped discussion. I was thinking of spliting off but was unsure what organizations should be good. Yeah, I guess at least compiler optimization can be splited. -- Taku 15:51, Nov 15, 2003 (UTC)

Remember the other uses of optimization in computer science. I've written programs to optimize the efficiency of routing of bulk cargo ships and done outline design for a system to optimize the crude oil shipping system of the Kuwait National Oil Company and there's the ever-present travelling salesman optimization problem (which the other two are somewhat representative of). This article seems more about code and compiler optimization that the whole spectrum of optimization in computer science. I think that Derrick made a good point and that this article needs to cover the general concepts in a shallow overview, then link to the detailed articles for specific techniques in each field. JamesDay 12:34, 17 Nov 2003 (UTC)

I agree with this. Any practical suggestion regarding the organization of this article? -- Taku 19:13, Nov 17, 2003 (UTC)

[edit] Premature Optimization

"Premature Optimization" gets redirected to this page - I don't think it should. Premature optimization is a subtle problem which isn't obvious to a novice. That's why Donald Knuth chose to highlight it. I'd like to see some rules of thumb or red flags which signal that an optimization is premature. Something like, when someone says "this will improve performance two-fold and we can easily manage the slight increased risk", or, as the project nears completion, there is a mad rush to implement a few more performance improvements before release.


In my opinion (but that's just mine !), the question of deciding whether an optimization is premature is essentially a matter of experience and personnal choices, and thus a whole list of what kinds of optimizations are premature is really difficult to establish, at least in an encyclopedia. Maybe a Wikibook on this topic would be interesting, or you may look at the bibliography of this article.
King Mike 15:03, 29 January 2006 (UTC)
All optimizations are premature unless you have _measured_ the system to be too slow. And you can't measure something that is not built yet. Lost Goblin 16:34, 29 January 2006 (UTC)
Exactly. Gritzko 04:15, 16 June 2007 (UTC)
I highly disagree with this comment. It may be the case that many programmers working on many kinds of programs (particularly in large project settings) are completely clueless as to where their machine resources are going, but this hardly applies to everything one might program. In particular, the lower level the code is and the more familiar with the underlying architecture the programmer is, the better its performance characteristics will be naturally understood. It's also unclear exactly what "profiling" means. I often find it much more useful to write specialized profilers that match the nature of the program than to simply rely on function call counts, but this is what people usually mean when they mention profiling. For code that intends to be high performance, function calls may break down entirely and you won't end up with information that is nearly as fine tuned as you require it to be.
I believe that the opposite of premature optimization (waiting too late to optimize) may be equally damaging, as the more mature a program's design is the less accommodating it will be to major underlying revision. It's important to design, program, and incrementally revise with particular performance goals in mind and simultaneously maintaining a clean and elegant design, rather than to just ignore performance and believe you can optimize it later. The practice of optimization is largely a matter of understanding resource usage to begin with, so in that sense truly "premature optimization" can hardly be considered optimization at all - it is usually just application of a few trivial transformations that the user was taught, and most likely shows that the user is incapable of effectively optimizing to begin with. A proficient optimizing programmer will have a deep understanding of both the underlying architecture (machine, libraries, OS) and the tools at hand (compiler). Even profiler results just give you a broad place to begin and capture some dynamic data (that you might be able to infer simply by your understanding of the problem); if you don't understand where resources are going then you won't be able to optimize it.
Interestingly, although premature optimizations (which I would consider to be pseudo-optimizations) can damage the clarity and functionality of code, true optimizations often enhance these aspects by making code simpler and more concise. Also interesting is the sheer number of optimizations Knuth himself has given in his books. I see the "root of all evil" mantra being preached so heavily without much further explanation (in a kind of fire and brimstone way where the phrasing and authorship itself carries so much weight that such explanation is deemed redundant) that I'm absolutely certain that it has perpetuated an attitude of lazy programming that has resulted in software with sub-par performance. Experience leads me to believe that often the people pushing this phrase are (much unlike Knuth) incapable of performing serious program optimization themselves, with or without profilers. Exophase (talk) 19:10, 8 February 2008 (UTC)

[edit] Reference 'the art of programming'

I dont know exactly what parts of the article are inspired by the cited source 'the art of programming'. I know of one book entitled like that, but it has no real focus on optimization. Instead, since D.E. Knuth is cited several times in this article, the real source may be his book, 'the art of computer programming'. Am I wrong ?

King Mike

[edit] Mergefrom Loop optimization

Well, this is a classic, isn't it? (regardless if this article is too long or not). Ripper234 20:35, 2 April 2006 (UTC)

The loop optimization article looks fine on its own, and since we're trying to pare down this article I'd recommend against merging the two. CRGreathouse 23:38, 19 July 2006 (UTC)
I am against merging both articles, maybe you are looking for a more specific one, such as Compiler optimization. In fact I would propose to remove the Loop optimization article or to merge it with Compiler optimization (which seems to contain the former one).

[edit] Contradiction

Improving about 20% of code is often responsible for 80% of the results contradicts with 90% of the time is spent in 10% of the code, and only 10% of the time in the remaining 90% of the code Sjord 17:59, 14 April 2006 (UTC)

Strictly, I don't think they are talking about the same thing -- one's about writing the code and the other about rewriting it. Still, I'd be happy to see both removed since they're essentially unverifiable. CRGreathouse 23:38, 19 July 2006 (UTC)

[edit] Very high-level language

in the article it says that python is a "very high level language". I would say very high level languages are Domain-specific_programming_languages (DSLs). IMHO Python is just a high-level language and it's definitely no DSL!

--Riedel 21:57, 19 July 2006 (UTC)

[edit] Energy efficiency

This article should also address hardware and energy aspects of efficiency and optimization. But it does not currently address them at all, and has no links at all to any such info!-69.87.202.60 12:05, 17 May 2007 (UTC)

Yes it should. Please add them. Derek farn 12:31, 17 May 2007 (UTC)

[edit] Should link to "A Plea For Lean Software" be removed?

The link

refers to an article requiring paid subscription to read. WP:EL guidelines state: "6. Links to sites that require payment or registration to view the relevant content." should general be avoided. If the IEEE is hosting it, I expect it is a classic article. Nevertheless, it's not accessable without payment. Is it OK to remove the link? Regards, --Camacan 03:33, 26 July 2007 (UTC)

Could we link to something like: http://cr.yp.to/bib/1995/wirth.pdf instead? Also, the current link does atleast point to an abstract, that's not all that bad.... --DFRussia 07:57, 18 August 2007 (UTC)

[edit] System versus Program versus Code

It appears to me that this topic starts out correctly -- talks about system optimization -- then goes right down into talking about optimizing pieces of code. But these are two separate beasts. The comments about premature optimization (by Knuth and others) focuses on optimizing pieces of code. But note -- many of these quotes are very old -- 30 years or so. And they are applicable to that domain (code optimization) but not to systems optimization which holistically cover the entire infrastructure.

What do you think? SunSw0rd 17:58, 13 November 2007 (UTC)

[edit] interpreted languages

The section on interpreted languages is bullshit. Interpreters usually work in 3 phases: parsing the source into a series of tokens, parsing the tokens into a syntax tree, and then interpretation of the syntax tree. By far, the most computationally expensive is the last phase, because that's where the code is actually executed. Removing extraneous whitespace and comments only affects the first phase (the lexical analyzer), which is already extremely fast.

This section is really talking about size optimization over networks. This has nothing to do with interpreted languages. You see this for any form of data. For any data, whether it is source code, executables, or plain media, it is better to have smaller sizes so it takes less time to transfer over a network.

In conclusion, I suggest this section be removed, and possibly replaced with an obvious "data should be optimized for size to speed transfer over networks" statement.

--72.177.62.3 (talk) 22:58, 11 December 2007 (UTC)

[edit] Macros

The macros section seems inaccurate to me. First, the claim is made that C preprocessor macros are mainly limited to performing the same function as inline. I've seen this argument before, which has led people to largely condemn macros as inline is safer and more practical. In reality, C preprocessor macros are capable of many of the same things that Lisp-like macros or C++ templates are. This is because they do not merely expand to text limited to one pass but will also recursively expand all expansions. This allows macro names (or parametric parts of macro names) to be passed as macro parameters, and although the expansion is either very wide and specific or very general (as opposed to something mixed like what template specialization allows) it can be hierarchically arranged to minimalize the amount of redundant macros needed to allow for specialization. I've seen few people actually use CPP macros in this fashion but it can be done and it's quite powerful (and I regularly program with this technique).

Second, I don't understand what kind of claim is being made that "compiled code" is substituted in Lisp-like macro systems. Compiled code is neither modified nor inserted. Macro expansion is as much textual expansion as the CPP style is, it merely has more facilities for pattern matching, hygiene, self recursion, specialization, etc. It's certainly a much more powerful system however.

I'd appreciate it if some others who feel experienced with these topics could give their input. I feel that this entire section should be rewritten, although it might not really belong in this node to begin with.

Exophase (talk) 19:53, 8 February 2008 (UTC)

[edit] Opening

The third paragraph of the opening:

Optimization can occur at a number of levels. At the highest level, the design may be optimized to make best use of the available resources. The implementation of this design will benefit from the use of efficient algorithms and the implementation of these algorithms will benefit from writing good quality code. Use of an optimizing compiler tends to ensure that the executable program is optimized. At the lowest level, it is possible to bypass the compiler completely and write assembly code directly. With modern optimizing compilers and the greater complexity of recent CPUs, it takes great skill to write code that is better than what the compiler can generate, and few projects need resort to this ultimate optimization step. Although a large amount of code written today is still compiled to run on the greatest percentage of machines. As a result, programmers and compilers don't always take advantage of the more efficient instructions provided by newer CPUs. Since optimization often relies on making use of special cases and performing complex trade-offs, a fully optimized program is usually more difficult for humans to comprehend hence tends to contain more faults than unoptimized versions; though this tends to hold mostly true for those not well-schooled in the programming language it was written in.

is a mishmash of several ideas; I tried to edit it and was unable to find a good way to handle it. Any suggestions?

I see the following points:

  • There are different levels of optimization
  • Hand-coding rarely beats compilers
  • Code is often compiled without machine-specific flags
  • Optimized code is more fragile than unoptimized code

Which of these need to be covered in the opening, and need they all be in this paragraph together?

CRGreathouse (t | c) 04:33, 11 February 2008 (UTC)

I think the second point in particular should not be in the opening and should have its own section altogether, because it deserves some elaboration. For one thing, I think what you said is exaggerated (and doesn't reflect what the original paragraph said) - for some platforms such as ARM (at least the versions widely commercially available right now) where GCC is used predominantly hand coding can beat it on a regular basis. "Rarely needed" can be verified by general consensus, but claiming that hand coded ASM can rarely beat a compiler should be backed by empirical evidence, if any exists.
Exophase (talk) 18:04, 11 February 2008 (UTC)

[edit] assembler optimisation requires "great skill"

"With modern optimizing compilers and the greater complexity of recent CPUs, it takes great skill to write code that is better than what the compiler can generate, and few projects need resort to this ultimate optimization step. Although a large amount of code written today is still compiled to run on the greatest percentage of machines. As a result, programmers and compilers don't always take advantage of the more efficient instructions provided by newer CPUs."

I have to disagree, in the case of streaming instructions and common assembler optimisations the only skills required are to be able to write assembler and to know of their existence. Its very easy to out do even the MS C++ compiler with some asm... my first foray into assembler programming was for just this purpose and was so easy that my horribly inefficient inline assembler was faster. Even simple stuff like the memcpy and memset packaged in the latest runtime libraries for MSVC++ can be easily optimised by using unrolled loops and the non-temporal copy instructions... Certainly nothing more skillful than just using the provided instructions as they were intended to be used.

Another classic example is that you have to specify the instruction set in the MSVC++ compiler, whereas it is easy (though time consuming) to write code that operates on all Intel CPUs utilising the SSE/SSE2 etc... instructions as appropriate.

I think its just that these technologies are still relatively new to the compiler and so the support is not yet full or powerful. I'm sure the compilers and libraries will catch up, but there will forever be new technologies available that will allow people with knowledge of them to outdo the compilers...

Of course this only applies to a small set of optimisations, its certainly not easy in all cases... but in many cases where you want to do it it is.

I find it disappointing that assembly programming is labelled as "something hard that you probably don't need", when really its easy and useful... particularly when you run out of optimisations to do... you may even find that optimising memcpy or memset will provide a signifcant boost (in a ~512x512 software triangle renderer used in Winamp AVS the speed gain from zeroing memory with assembler was about 3-4fps, from 27fps up to about 30fps).

I'm going to edit this statement to remove the "great skill" part, it seems out of place even once I ignore my personal feelings on the matter... "it is most often diffcult" is better imo... —Preceding unsigned comment added by 194.168.3.18 (talk) 15:28, 25 March 2008 (UTC)

oops... not logged in

-- Jheriko (talk)  —Preceding unsigned comment added by 194.168.3.18 (talk) 15:31, 25 March 2008 (UTC) 

[edit] Dispute of sentence

"In some procedural languages, such as C and C++, macros are implemented using textual substitution, and so their benefit is mostly limited to avoiding function-call overhead."

Anyone that's read the inlining article on wikipedia would know that removing the function-call overhead is the -least- effect that inlining code has. Constant folding and dead code removal have a far greater impact. I'm not game to change it myself, as I don't know anything about functional programming languages. Themania (talk) 10:19, 5 April 2008 (UTC)