Talk:Distributed computing
[edit] P2P
The current second paragraph here seems extremely weak.
With the recent work in JINI, it seems to me that much more could be said about P2P. I don't feel awake enough at present to say it!
Bernfarr, 04:47, 19 Oct 2002
[edit] openmosix
Is openMosix a cluster or a distributed system? --dgd
[edit] Too specific
I think this is a narrow view of the subject. I feel that "Distributed" means that processing for a task takes place on more than one machine. This view incorporates classical interpretations of the term as used in Distributed COM (DCOM), for example. It also is friendlier to interpreting client-server and 3-tier architectures as distributed, which I think they are. --168.98.178.80, 17:46, 23 Feb 2004
I think this is too broad a view of the subject. Why is there a Multiprocessor systems section? They are not distributed at all! --84.220.108.55 12:26, 27 August 2005 (UTC)
[edit] Effects on component life, energy consumption, etc
Does anyone have any data about the effect running distributed computing programs has on component life, energy consumption, etc? --203.26.206.129 03:37, 13 Apr 2004 (UTC)
- I second! I was going to ask that myself! FireWire (talk) 07:04, 19 March 2008 (UTC)
[edit] rm: Project Dolphin
- Project Dolphin takes a count of the number of keys you press on your keyboard. This is mostly an event made of teams.
Removing this non-information from the article. There is no description of at least a goal of the project, nor links to Wikipedia or elsewhere. The one link I found, http://dolphin.twistification.net, seems to be dead. And what is an "event made of teams"?
—Herbee 18:42, 2004 May 24 (UTC)
[edit] Distributing Operating System - separate article?
Distributed operating systems are quite different from distributed computing and they deserve an article of their own, IMHO. I will wait a while for an answer and then start such an article if nobody opposes it. Szopen 11:47, 16 Sep 2004 (UTC)
Plan 9 is one of the distributed operating systems.
- I agree, this article doesn't say anything about what a distributed operating system is, yet distributed operating system links to this article. --Abdull 10:04, 5 December 2006 (UTC)
[edit] Distributed computing - a category on its own?
Also, here we use "distributed computing" as a name for a whole class of problems, of which recent fashionable terms like grids etc. are just another incarnation. I think the topic deserves a category of its own, and articles on problems such as the FLP impossibility result, failure detectors, software distributed shared memory, prediction, etc. Szopen 12:10, 16 Sep 2004 (UTC)
[edit] DISTRIBUTED Computing
Why are y'all steering folks AWAY from DISTRIBUTED Computing?? Folding@Home?! I will join either organization that gets my 'LINDOWZ' Linux machine to work on Work-Units... even 'Koppix', LOL. I've done 25 F@H WUs while folding [console mode] along with Google-Labs-F@H [client mode] on my WinTel CPU, and I'm dying to try an install with a flash drive. I think it would be a 'GAS'! [$25/128-meg stick] Adding the most successful 'Folding At Home' [proteins] to the categories [I hope!], re: http://folding.stanford.edu/faq.html , http://folding.stanford.edu/FAH3.html --144.160.5.25, 00:54, 2 Dec 2004
- Nope not steering away, just biding our time. -- Dbroadwell
[edit] Definition
There is no definition of what 'distributed computing' is in this article. In my opinion distributed computing means programs located on physically separated computers which must communicate in some way in order to complete a given computing task. saer
And what would be the difference from Distributed systems then? Szopen 11:33, 28 Jan 2005 (UTC)
BTW, I see no sense in Distributed Operating System pointing to Distributed computing. It's absurd. DOS is part of DC, but it deserves an article of its own! Szopen 11:34, 28 Jan 2005 (UTC)
[edit] Distributed computing environment
Distributed computing environment is probably a 'See also' page. --Michal Jurosz 13:19, 6 Feb 2005 (UTC)
[edit] Big Merge!!
I saw merge tags on Distributed programming and Distributed system and without a blink I merged them here as sections. I'll revisit to clean up later as I need this article in better shape. This article needs a structure; does anyone have ideas? -- Dbroadwell 02:42, 5 Mar 2005 (UTC)
- Ok, cleaned it up a little, in that I shuffled the paragraphs together in a somewhat logical order, the ones that were less technical first. Tagged for cleanup too. -- Dbroadwell 03:52, 5 Mar 2005 (UTC)
- Ok, I will revert that, ok? I believe, as I stated before, that Distributed Systems and Distributed programming need articles of their own. They are both DC of course, but that does not justify merging. E.g. History of Poland is of course part of History of Europe, as is the history of, say, Belarus, but that does not mean that History of Poland should be merged with History of Belarus as sections in History of Europe!
- Well, people have ignored my statement, but I will not behave like others. I will wait a few days before reverting. Maybe Distributed system is somewhat redundant, but definitely not distributed programming. -- User:Szopen 07:19, 5 Mar 2005 (UTC) (signature pulled from history by Dbroadwell)
- Please sign your comments. Raul, who is working on his distributed computing PhD, is making an outline for this article. If distributed programming has enough for an article itself it will get it, I think it will ... for now I was trying to centralize the mess to clean up and copyedit it. In short, please wait till after this weekend before going on a wholesale reversion spree. -- Dbroadwell 20:34, 5 Mar 2005 (UTC)
Ok, bit outline applied. To be honest, the information that was in 'Distributed programming' was just a nice list of distributed computing architectures and so really does belong here. Not that 'Distributed programming' doesn't deserve an article of its own, just not with the information that was there. -- Dbroadwell 06:20, 7 Mar 2005 (UTC)
I think a distinction needs to be made between distributed systems, w.r.t. computing, and distributed systems w.r.t systems theory (i.e. a system with an infinite number of state variables). I'm not an expert on the subject, but these are completely distinct topics.--PerfectZero 06:58, 9 August 2006 (UTC)
I deleted the link from Fallacies of Distributed Computing to Distributed Programming that was redirecting to Distributed Computing. The merge described above appears to be complete, except for these sorts of loose ends. Philomathoholic 08:00, 12 December 2006 (UTC)
[edit] does this go to supercomputers?
Just a bit of stuff from the outline that User:raul654 gave. I'm thinking it should go into supercomputers until I fill it in a bit more ... -- Dbroadwell 06:20, 7 Mar 2005 (UTC)
[edit] History of supercomputing
mention Illiac
[edit] Types
[edit] SISD
single instruction stream, single data stream
[edit] MISD
multiple instruction stream, single data stream
[edit] SIMD
single instruction, multiple data
[edit] MIMD
multiple instruction, multiple data
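As a small aid to the outline above: these four entries are the classes of Flynn's taxonomy, which can be told apart by two counts only, the number of concurrent instruction streams and the number of concurrent data streams. The sketch below is only an illustration; the helper name `flynn_class` is made up here and does not come from any library.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn class for the given stream counts (hypothetical helper)."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    # "S" for a single stream, "M" for multiple streams.
    instr = "S" if instruction_streams == 1 else "M"
    data = "S" if data_streams == 1 else "M"
    return f"{instr}I{data}D"


# A classic uniprocessor: one instruction stream, one data stream.
print(flynn_class(1, 1))
# One instruction applied to many data elements at once.
print(flynn_class(1, 8))
# Independent machines, each with its own program and its own data.
print(flynn_class(16, 16))
```

Under this reading, a network of independent computers falls into the MIMD class, which may help when deciding whether the list belongs here or in the supercomputer article.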
[edit] Tanenbaum
As stated by Andrew S. Tanenbaum, "Distributed systems need radically different software than centralized systems do."
- Heh, this quote made me register only to talk about it. Wouldn't that "radically different" thing be because of the way the software is made? If Tanenbaum meant that all kinds of software need to be developed in a radically different way than in centralized systems, I'd say he's plain wrong. Take a multilayered model of developing software. Higher layers communicate with lower layers through a common language or objects. In that way distributing tasks between computers would be just a matter of distributing more objects. You just need to develop good layers. Well, this is also a model that solves a whole lot more problems than that, but it's just an example. --Renrutal 08:49, 1 Apr 2005 (UTC)
[edit] definition mixup
The article parallel computing defines distributed computing as one of its methods. The far-reaching analogies in this article about the world wide web and whatnot sound really off-key. It looks like a bunch of duplicated, less than coherent, mess. --Joy [shallot] 00:40, 19 Jun 2005 (UTC)
- I removed the world wide web section, because:
- The section itself even mentioned it: it is "not distributed" computing. It described how computers communicate with servers, in order to get a webpage. No computation as described in the main article is involved in this process.
- It was poorly written
- Addressed the reader ("as you are reading a webpage")
- Parts repeated ("as you are browsing the web"), bad explanations. It just sums up parts of the web that are 'distributed systems'.
- Irrelevant information, EVEN for the explanation of 'distributed systems' in the WWW, like proxy servers (which are serial not parallel).
- SuperMidget 17:16, 28 March 2007 (UTC)
[edit] Unbounded nondeterminism misplaced?
One of the challenges listed in the Openness section does not seem to be associated with openness. Unbounded nondeterminism is a problem associated with some protocols (most notably contention protocols) and with some levels of unreliability (with high enough reliability the time it takes to complete an operation need not be unbounded). But it does not seem to be related to the publishing of complete and neutral component interface specifications (a reasonable definition of openness). However, I am prepared to be convinced otherwise! Tim Watson 13:01, 10 September 2005 (UTC)
- Thanks for your comment. I have clarified the text. See what you think.--Carl Hewitt 15:57, 10 September 2005 (UTC)
Thanks Carl. It's clear that the definition of open systems is still evolving. I remember when it meant UNIX or the OSI 7-layer model! Tim Watson 16:41, 10 September 2005 (UTC)
[edit] Web services reference: just a marketing claim?
I feel that the claim that Web services is the dominant technology is more like a marketing claim for them than truly reflecting the situation, e.g. neither the SETI@home project nor Linux clusters use that technology.
--David Woolley 12:47, 15 October 2005 (UTC)
- These days everybody is developing Web Services including distributed computing systems on Linux. Some of the older systems such as SETI do not (yet).--Carl Hewitt 21:08, 15 October 2005 (UTC)
Actually, the goal statement is wrong in a number of areas.
1) The main goal is not to connect users in a 'transparent' way. This is a goal of one type of distributed system, the remote procedure call (RPC) [1], not distributed systems in general. For example, Java's RMI and Jini system believe the opposite, that the distributed system should not be transparent to users [2].
2) Distributed systems are not 'open' because the protocols they use are different for different types of distributed systems. For example, DCOM doesn't communicate with Jini. What they do provide, however, is a standard communication protocol. This is not the same as open.
3) The statement 'Today Web Services provide the standard protocols for connecting distributed systems' is simply not true. Firstly, web services are not a protocol, they are a data representation standard or an implementation of XML for data transmission. A web service is simply a service that accepts XML data in some format or another. The actual protocol is normally HTTP but it can, and does, run over many other protocols. Still, generally speaking today web services run over HTTP and so the actual protocol in that instance is HTTP, not 'web services'. This is pure meaningless marketing hype.
Secondly, there is no single defined web services standard, in that there are many different implementations, such as REST (of which there could be an infinite number of formats), SOAP (various versions), and various vendor extensions.
In terms of dominant protocol, that has to be HTTP.
[1] Birrell, A.D. & Nelson, B.J. "Implementing Remote Procedure Calls." ACM Transactions on Computer Systems 2, 1 (February 1984): 39-59.
[2] Waldo, J., Wyant G., Wollrath, A. & Kendall, S. "A Note on Distributed Computing." Technical report, SMLI TR-94-29, November 1994.
[edit] List of researchers not appropriate
Especially given that this is a master topic for many different sorts of distributed computing, I feel that a fair list of researchers would run into thousands, which would be unreasonable for inclusion in this article. I feel the current list is somewhat arbitrary, especially as, of the initial four, one isn't present at all, and two others are stub articles, with, in one case, this article being the only substantive referrer and, in the other case, there being only one other referring article.
--David Woolley 18:18, 26 October 2005 (UTC)
- I concur.--Carl Hewitt 19:38, 26 October 2005 (UTC)
[edit] Cleanup Intro
I wrote a new intro and removed the quote from Tanenbaum as it did not contribute anything useful to the topic. This article is in dismal shape. Types of processors should not be included such as SISD, SIMD, MISD, MIMD. If you want to mention this, you could mention the types of systems that can be used in distributed networks and how smaller parallel computers and even clusters can be connected together to make a distributed environment. But as it is now, it doesn't belong.
The Goal section is ambiguous. The last line about web services is out of place. Why should we point that out versus other software technologies? Move it to the section just below it about examples.
The Openness section seems like it's taken from a specific product with someone pushing it. Just an impression mind you.
The Multiprocessor section doesn't add anything useful. It's off-topic and doesn't belong here. The following section about multicomputer systems is ok, though, and should maybe be highlighted or expanded.
The architecture and concurrency section seems good to me as well, but there's something about the formatting that is not appealing.
--Budd (Anon) Feb 17, 2006
- I agree with much of what you say about the article in general but the new introduction is too wordy and too specific. The previous intro described distributed computing as the coordinated use of physically separate computers. This seems to be a very good definition. Distributed computing includes any coherent system that uses a network of computers, the computers don't have to be remote, the protocols don't have to apply to a range of hardware and the code and information doesn't have to be portable. What ties the subject together is the common set of problems and techniques: communication between components, managing state, naming and locating, synchronisation, fault tolerance etc.. These topics apply just as well to a vehicle performance management system as they do to SETI@home or a CORBA-based distributed system for business. Tim Watson 07:29, 18 February 2006 (UTC)
[edit] A "List of Problems in Distributed Computing"
I am interested in helping to add a section on "Problems in Distributed Computing" such as "consensus," "mutual exclusion," "leader election," "dining philosophers" and so on: essentially, the classic problems studied by researchers in distributed algorithms. These are precisely-defined problems in the literature, which have a variety of solutions (or lack them!) depending on the specific model of distributed computing that is chosen.
Trying a few, "dining philosophers" is the only one of these which already has a page on Wikipedia.
Possibly each problem could merit its own page; or perhaps it would be better to collect them all on a page called "Problems in Distributed Computing."
Looking for similar collections for hints how to organize this, I see that there is a "List of complexity classes." A "List of Problems in Distributed Computing" could be considered to have a similar breadth and depth. There is a "List of NP-complete problems," but apparently no other list of sequential algorithmic problems.
Do others agree that this content merits a "List" of pages? Or should it be localized on one page for now?
I can write some content for this collection, but I'm by no means an expert. I hope others will be interested in contributing.
Is this a good idea? Any suggestions? Ezrakilty 18:29, 30 March 2006 (UTC)
- A new single section that summarizes and then links to the separate pages on each of those algorithmic problems seems like a good idea. -- Bovineone 18:33, 30 March 2006 (UTC)
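To make "precisely-defined problem" concrete for one of the items named above, here is a rough sketch of leader election on a unidirectional ring, in the style of the Chang-Roberts algorithm. It is a single-process simulation written for this discussion, not a networked implementation; the function name `ring_election` and the round-by-round message delivery are assumptions of the sketch, and node IDs are assumed to be unique.

```python
def ring_election(ids):
    """Simulate ring leader election; return (leader_id, messages_sent).

    Hypothetical helper: ids[i] is the unique ID of the node at ring position i,
    and each node can only send to its clockwise neighbour.
    """
    if not ids or len(set(ids)) != len(ids):
        raise ValueError("ids must be non-empty and unique")
    n = len(ids)
    # Every node starts by sending its own ID to its clockwise neighbour.
    in_flight = [((i + 1) % n, ids[i]) for i in range(n)]
    sent = n
    while in_flight:
        next_round = []
        for pos, candidate in in_flight:
            if candidate == ids[pos]:
                # The ID travelled all the way around: its owner is the leader.
                return candidate, sent
            if candidate > ids[pos]:
                # Forward larger IDs; smaller ones are silently dropped.
                next_round.append(((pos + 1) % n, candidate))
                sent += 1
        in_flight = next_round
    raise RuntimeError("unreachable when ids are unique")


leader, messages = ring_election([3, 1, 4, 2])
print(leader, messages)
```

The point for the proposed list is that the problem statement (exactly one node ends up declared leader) is separate from the model (here a reliable ring with unique IDs); change the model and the set of possible solutions changes with it.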
[edit] Spammy lists
There have been a lot of lists of links created in both the Distributed computing and Grid computing articles. I originally added the {{cleanup-spam}} because those lists were beginning to over-power the main content of the articles. Perhaps some of them can be moved to new articles so that people will be encouraged to add them there instead of in the main articles? Although this may be bordering on what wikipedia is not, it may still be possible to create a "List of distributed computing journals" and "List of distributed computing organizations" or a "List of distributed computing software" to complement the existing List of distributed computing publications and List of distributed computing projects? -- Bovineone 17:57, 15 May 2006 (UTC)
- Perhaps we could just link to the dmoz directories of distributed computing journals, conferences, projects, and general distributed computing links. The List of distributed computing projects primarily lists projects about which there is a Wikipedia entry. This (kind of) makes sense, since the Wikipedia links won't appear in something like dmoz. But assembling a list of purely external links seems like it would be contrary to what wikipedia is not. Wikipedia is not a directory. Dmoz is a directory, so why not make use of it? --Allan McInnes (talk) 18:52, 15 May 2006 (UTC)
[edit] Shared computing
How is this different? --M1ss1ontomars2k4 | T | C | @ 00:11, 22 May 2006 (UTC)
- Distributed computing in general is just computation using multiple computers (possibly even dedicated clusters that do not need to "share" their use with other workloads). Shared computing is a subset of distributed computing where cycle stealing or idle cycles only are used, usually by people volunteering their PCs. -- Bovineone 04:56, 22 May 2006 (UTC)
[edit] Merge proposal
See the discussion in Distributed programming. I think the two articles should be merged. Hervegirod 09:29, 2 September 2006 (UTC)
This is done now. All the deleted text is in the Distributed computing article, may need to be cleaned up a bit Hervegirod 20:14, 27 September 2006 (UTC)
[edit] Where are the CSP and web-service models?
Communicating Sequential Processes (CSP) is the most cited "abstract model" for concurrency in distributed computing, yet CSP is not cited in the article!
... Web Service Architecture (see W3C) is the "top model" today, and it does not have a section in the article.
—The preceding unsigned comment was added by 201.43.55.80 (talk • contribs) .
- If you feel there should be more discussions of CSP and WSA then by all means add them (with references please). Wikipedia is open to everyone for editing. --Allan McInnes (talk) 04:05, 26 October 2006 (UTC)
[edit] Other examples... ?
distributed commuting?
--Renice 16:48, 4 January 2007 (UTC)
[edit] Contradiction
The article first states that the World Wide Web is not a distributed computing system, then at the end of the article that it is one example of distributed computing system. --Edcolins 10:33, 21 January 2007 (UTC)
- At the end it says "An example of a distributed system is the World Wide Web". A distributed system is not the same as distributed computing. The article could make a clearer distinction and have a section titled "Distributed systems". Or maybe Distributed system should get its own article again [1] instead of redirecting here. PrimeHunter 16:49, 22 January 2007 (UTC)
- I made a quick fix [2] to avoid readers thinking there is a contradiction, but distributed systems should be introduced better. PrimeHunter 16:58, 22 January 2007 (UTC)
- I agree that the article confuses what is sometimes called Distributed Systems and what is sometimes called Distributed Computing. Having a "Distributed Systems" section is not right; it should be the other way around since distributed computing is more specific than distributed systems. Or as you suggest they should be split into separate articles. --Nethgirb 23:51, 22 January 2007 (UTC)
- I think an article can have a section for a more general concept, e.g. if the specific article concept is more known/notable than the general, but I'm not judging this case, and I don't expect to work on it. PrimeHunter 13:19, 23 January 2007 (UTC)
[edit] Drawbacks
If you want drawbacks, I've got tons. Someone will have to search for references though. So I'll just list them here.
- Reliability. There is no guarantee that communication will arrive at another node in a timely fashion. You can't even be sure that something happened at all. This is a common problem. TCP/IP must deal with this too.
- Security. Because all nodes usually use a public network for communication, there is the risk of attempted unauthorized access.
- Multiple Platforms. Although there is no requirement for multiple platforms in general, there are many applications that require it, such as web browsing. Getting every platform to have its respective version of the software takes a lot of work. Getting them all to give the proper results is even more problematic. Then there's maintaining all those different versions of the software.
- Coordination. All distributed environments must be coordinated. Much like a workplace, you can't have everyone in one big room haphazardly running around. Everyone must have a task. Same goes for nodes in a distributed environment. There needs to be a way to identify nodes, add nodes, remove them, and change what task they are doing, all while distributed computing is going on.
- Protocol. From within a programming language, you can send data using language features such as objects, structures or messages. In a distributed environment, everything you send to other nodes must be formatted in a way that the receiving end can make out what it is. In short, everything that goes out on the wire must be serialized, much as if it were being made persistent. But instead of saving and loading from disk, it is sent out and received through the communication channel.
There's a lot more, but those are the main ones. —The preceding unsigned comment was added by 142.167.85.107 (talk) 07:33, 16 February 2007 (UTC).
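To illustrate the Protocol point in the list above, here is a minimal sketch of putting an in-memory object "on the wire" and getting it back. The framing chosen here (a 4-byte big-endian length prefix followed by a JSON body) and the names `encode_message` and `decode_message` are assumptions made for this example, not something any particular system mandates.

```python
import json
import struct


def encode_message(payload: dict) -> bytes:
    """Serialize a dict into a length-prefixed frame (hypothetical format)."""
    body = json.dumps(payload).encode("utf-8")
    # "!I" is a 4-byte unsigned integer in network (big-endian) byte order.
    return struct.pack("!I", len(body)) + body


def decode_message(frame: bytes) -> dict:
    """Parse one frame produced by encode_message."""
    if len(frame) < 4:
        raise ValueError("frame shorter than the 4-byte length prefix")
    (length,) = struct.unpack("!I", frame[:4])
    body = frame[4:4 + length]
    if len(body) != length:
        # The receiver cannot assume the whole message arrived.
        raise ValueError("truncated frame")
    return json.loads(body.decode("utf-8"))


task = {"node": "worker-7", "op": "sum", "args": [1, 2, 3]}
frame = encode_message(task)
assert decode_message(frame) == task
```

The truncation check also touches the Reliability point: the receiver has to decide what to do when only part of a message shows up, which never comes up when passing objects inside a single program.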
All those issues are effectively dealt with by literally hundreds of scientists. Also, coordination is NOT always needed; it depends on the needs. Szopen 12:11, 13 September 2007 (UTC)
[edit] Volunteer Computing
The long paragraph under "Drawbacks and Disadvantages" that begins, "Distributed computing projects may generate data that is proprietary to private industry" is actually referring, I think, to the phenomenon of Volunteer computing, which is a specific use of distributed computing that harnesses the computing resources of volunteers, and which already has its own page. The comments in that paragraph do not apply to distributed computing as such, which is a technical subject and which need not (and often does not) have any of the proprietary questions that the paragraph raises.
I propose that this paragraph be moved to a "Drawbacks and Disadvantages" section on the Volunteer computing page. Perhaps explicit mention of the relationship between the two topics should be made on the Distributed computing page.
I notice that in some other ways, the notion of volunteer computing is being confused with distributed computing here. For example, the intro mentions BOINC, a framework for building volunteer-computing systems, as an example of distributed computing, without mentioning the special nature of projects such as BOINC. Also, the "Examples" section describes only large-scale volunteer computing projects. Although it does say that "many are run on a volunteer basis" it gives the impression that all or almost all distributed computation is of this sort, which is not true.
I can try to flesh out the "Examples" section and otherwise clarify this distinction. Ezrakilty 11:17, 4 July 2007 (UTC)
[edit] Drawbacks
I'm seconding the comment above.
In reviewing the drawbacks section of this document, the possible failings of projects that use volunteer computing networks seem quite out of place. All projects have the possibility of being poorly planned, having outcomes that are not useful, of wasting time and money, etc. Why should distributed computing be exceptional in this way? I would remove the discussion, or link it to a discussion on SETI@home, etc.
[edit] Quite confused about distributed systems
Can anyone give a real-life instance of a distributed system? Regards, Balaji.K.Jeyaraman --Preceding unsigned comment added by KJBalaji (talk • contribs) 00:47, 13 September 2007 (UTC)
- Just read the article and look at the recommended sites. --Statsone 04:46, 13 September 2007 (UTC)
[edit] DSM vs Space-based
I am quite confused by the definition of "space-based" computing. I was taught that tuple spaces are just one example of the distributed shared memory paradigm. Can someone explain why space-based is different from distributed shared memory, and why space-based is used here instead of DSM? Szopen (talk) 13:11, 17 April 2008 (UTC)
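For anyone following this question without the background, a tuple space in the Linda style is a shared store that processes write tuples into and read or remove tuples from by pattern rather than by address. The sketch below is a toy, single-process version meant only to show those operations; the class name `TupleSpace`, the use of `None` as a wildcard, and the non-blocking behaviour of `rd` and `in_` are simplifications assumed here (in Linda proper, `in` and `rd` block until a match exists).

```python
class TupleSpace:
    """Toy in-memory tuple space (hypothetical, not distributed or thread-safe)."""

    def __init__(self):
        self._tuples = []

    def out(self, tup):
        """Write a tuple into the space."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern acts as a wildcard for that field.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def rd(self, pattern):
        """Return the first matching tuple without removing it, else None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def in_(self, pattern):
        """Remove and return the first matching tuple, else None."""
        tup = self.rd(pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup


space = TupleSpace()
space.out(("job", 1, "pending"))
space.out(("job", 2, "pending"))
print(space.rd(("job", None, "pending")))
print(space.in_(("job", 2, None)))
```

Whether this associative, by-content access is enough to count as a paradigm separate from DSM, where memory is usually addressed by location, is exactly the question raised above; the sketch only shows the access style and does not settle it.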