Talk:Shannon–Hartley theorem

From Wikipedia, the free encyclopedia

This article is within the scope of WikiProject Telecommunications, an attempt to build a comprehensive and detailed guide to telecommunications on Wikipedia. If you would like to participate, you can edit the article attached to this page, or visit the project page, where you can join the project as a "full time member" and/or contribute to the discussion.



Shannon's theorem?

I don't think this is Shannon's theorem; it is simply his definition of informational entropy (= expected amount of information).

Shannon's theorem is a formula for the maximal rate at which you can send information down a pipe if you know the bandwidth and the signal-to-noise ratio. AxelBoldt

Fair enough. I was just replacing garbage, and took planetmath.org's word for it. One undergraduate course aside, I know no real communication theory -- User:GWO

I would like to see a reference on this page to the means used to obtain the maximum rate: Shannon's reasoning was based on assuming an equivalence between signal space and N-dimensional Euclidean space; noise power determined the size of the spheres, the number of dimensions being the number of possible signal tuples, and the result being how to pack the maximum number of spheres into that space, for which the limit was well known. Bukowski

Digital vs analog bandwidth?

Wikipedia just needs more general information about the relationship between digital and analog bandwidth. What other articles/subjects are related to this? - Omegatron 16:30, Apr 24, 2005 (UTC)

Modem comparison needs peer review

Original: The last paragraph (especially "V.90 claims a rate of 56 kbit/s, apparently in excess of the Shannon capacity") is wrong. The V.90 data rate does not account for compression.

Edit by Mark Rejhon: This modem information has now been clarified from Mark Rejhon's edit about a month ago. This needs some peer review. Also, 56 kbit/s is achieved without compression (or rather, 53 kbit/s, due to FCC regulations on signal level). This is done without compression; rather, it is done by avoiding a digital-to-analog conversion step at the telco end. This information is widely available on the Internet, but needs to be confirmed with accurate sources.

From my study of the Shannon channel capacity theorem, compression doesn't matter. Information is information, compressed or not. To the best of my knowledge, Shannon made no assumptions about the coding of information. This is his real genius. In general, his analysis is based on "nats", not bits (base e as opposed to base 2). Binary is a convenient coding, so most people talk about bits. As far as I know, telcos typically sample at 8 kHz with 8-bit samples. This means a Nyquist limit of 4 kHz and an ideal quantization noise of about 48 dB (maybe even 54 dB if you use half of 256). If you lived 10 feet away from your central office and there was no other interference, you might get that kind of S/N. The reality is that you are probably thousands of feet from the CO and there are hundreds of other lines next to yours adding noise. I am told that 30 dB S/N is possible with a bandwidth of 3.5 kHz. This works out to about 35 kb/s. If you assume 48 dB, the channel capacity is 55.8 kb/s. Close to that mystical figure of 56 kb/s, I would think. "Principles of Communication Systems" by Taub and Schilling is a great book. Madhu
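As a sanity check, the two capacity figures above can be reproduced with a short script (the bandwidth and S/N values are the assumed ones from the comment, not measurements):

```python
from math import log2

def shannon_capacity(bandwidth_hz, snr_db):
    """Shannon-Hartley capacity C = B * log2(1 + S/N), with S/N a power ratio."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * log2(1 + snr_linear)

# The two scenarios discussed above:
print(shannon_capacity(3500, 30))   # roughly 35 kbit/s at 30 dB S/N
print(shannon_capacity(3500, 48))   # roughly 55.8 kbit/s at 48 dB S/N
```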
If you have, say, encyclopedia.txt, a 100 MB file, and you compress it down to 1 MB, send it in one second, and then decompress it later, you could say that you had sent 100 MB/sec. I think that's all they meant. They are talking about the actual transmission speed (in this case 1 MB/sec). Apparently modems can also do on-the-fly compression, which would give effective rates greater than the actual channel capacity for easily compressed data. - Omegatron 16:38, Feb 17, 2005 (UTC)
In my understanding, anyway, the theorem only applies to the actual data being sent, regardless of whether it is compressed data, binary, or all zeroes. In other words, compression of the data does not count towards getting closer to the theoretical max. - Omegatron 17:02, Feb 17, 2005 (UTC)
I think we're saying the same thing. The important detail is the terminology. Shannon talks about information rate as opposed to bit rate (he doesn't really talk about bits). In your above example, the information rate is 1 MB/sec. If I have a 100 MB file with truly random data that cannot be compressed (try compressing a compressed file sometime ;-), then the bit rate drops to 1 MB/sec instead of 100 MB/sec. Another way to say this is that a highly compressible file contains less information than an incompressible file. Madhu

Several versions

There used to be five different versions of this article. Now there are three. Over the last while, several people have merged these articles (myself included, for one of these merges, but it appears I am not the only one). However, there are still two more articles that need to be merged into this article.

These are now the same article:

We need to clarify the differences or merge the similarities between these articles. From what I understand, they are all the same thing, and should be merged and redirected to Shannon-Hartley theorem. Also clarify the relationship to Nyquist-Shannon sampling theorem. (One deals with digital data and one deals with sampled analog data, correct?). Then clarify in bandwidth the relationship between regular analog bandwidth, sampled data bandwidth, maximum digital bandwidth, etc. - Omegatron 20:21, Aug 9, 2004 (UTC)

There are two different equations, and two different concepts, but I feel they are related closely enough and have similar enough names that they should all be combined into one article, and the differences and similarities explained within. This way people searching for one will not get confused, not realizing that they are looking at the wrong article. - Omegatron 20:02, Sep 18, 2004 (UTC)

Difference between laws

According to this site http://www.cs.nmsu.edu/~jcook/Classes/DE-CS484/Physical-1.html the "law" section is actually part of the Nyquist theorem.

According to this site http://www.fas.org/man/dod-101/navy/docs/es310/DigiComs/digicoms.htm the laws are:

R = W Log2 M

and

C = R Log (1 + S/N)

where:

R = the rate at which data can be transferred, given in bits per second (also known as the baud rate)

W = the minimum bandwidth required to create this pulse

C = capacity in bits per second (bps)

S/N = signal-to-noise ratio (depends on modulation type and noise)

I'm confused. - Omegatron

This "cheat sheet" describes it better:
http://tcode.auckland.ac.nz/314.5.pdf
The capacity related to signal levels is only for noiseless signals. The other formula is a better one and takes noise into account. It also says that the noiseless one was derived by Nyquist, so I think the one labeled "Shannon-Hartley law" is actually the Shannon-Nyquist formula. - Omegatron
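To keep the two formulas straight, here is a small numerical sketch (all figures are assumed for illustration, not taken from the cheat sheet): the noiseless Nyquist limit grows without bound as the number of levels M grows, while the Shannon capacity is fixed by B and S/N no matter how M is chosen.

```python
from math import log2

B = 3100      # assumed telephone-channel bandwidth in Hz
M = 4         # assumed number of distinct signal levels
snr = 1000    # assumed signal-to-noise power ratio, i.e. 30 dB

# Nyquist's noiseless limit: doubles in bits/symbol each time M is squared.
nyquist_rate = 2 * B * log2(M)

# Shannon's capacity: noise caps the rate regardless of M.
shannon_capacity = B * log2(1 + snr)

print(nyquist_rate)       # 12400 bit/s
print(shannon_capacity)   # about 30900 bit/s
```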


I think you've combined the topics correctly, except there isn't much connection (at least in my mind) between Shannon's Law and the Nyquist sampling rule. On an unrelated note, I found the D = 2B log2 M noiseless formula to be confusing, as it's not part of the theorem; it's just part of the thinking leading up to it. So I've replaced it with a written version of the same concept which uses a "thought experiment" model instead of a formula to describe how to transmit infinite information over a fixed-bandwidth link (and how noise makes that impossible in practice). technopilgrim 20:36, 19 Sep 2004 (UTC)

Alright. The first equation was labelled as Shannon's law in the article of the same name. I was just merging. They seem to be seen together a lot in articles online, though... I'm skeptical that it should be removed. Maybe it could be added to the Nyquist-Shannon article instead? - Omegatron

My problem with R = 2*B*log_2(M) is that it seems to be neither fish nor fowl. If we were giving the full derivation of the S-H theorem, this is an important and non-trivial milestone on the way to the full proof, and we would definitely want to include it (as do professors when giving extended talks on the topic). But we are trying to write a concise encyclopedia article, and we can't assume we are addressing an audience of engineering students. Where does the 2*B factor come from? Shouldn't it be simply B? These are advanced questions in information theory (which is why radio communication was decades old before anyone understood this). Not to say that a formula can't be a great way of showing things in some situations, but here it is perplexing (part of the problem is that the explanation accompanying the 2*B*log(M) formula is not quite correct). technopilgrim 19:27, 20 Sep 2004 (UTC)

I like your description. It is clearer than the formula. Thanks. - Omegatron 18:22, Sep 21, 2004 (UTC)

"on each cycle" should be changed to each clock or each transmission or each pulse or something - Omegatron 18:30, Sep 21, 2004 (UTC)


Steganography

Example two concludes: "This shows that it is possible to transmit using signals which are actually much weaker than the background noise level, as in spread-spectrum communications."

Would it be fair to say that that is the principle behind steganography? --Elijah 23:48, 2005 Jan 3 (UTC)

No, at least, not for cryptographic steganography. I should say first that steganography is a much slipperier subject than cryptography - it's harder to specify and analyze. But a spread-spectrum signal need not be any harder to detect than a narrow-spectrum. A signal that is transmitted at a very low bitrate may be difficult to detect, but even this is not necessarily true. In steganography, the message is normally hidden inside another message. The study of analog, technical ways to transmit signals secretly is not really within the purview of steganography - it includes tricks like line-of-sight infrared lasers, transmissions as short, high data rate bursts, and other esoteric tricks. --Andrew 05:04, Jan 4, 2005 (UTC)

Comparison

In the "R < C" section, is C (the capacity) of the unit "bits per second", as in "my modem has 56 kbit/s"? --Abdull 06:51, 16 October 2005 (UTC)

In strict terminology, C should be measured in "bits per channel use", which is a ratio between "bits per second" and the Hertzian bandwidth available for transmission. In this case, C is not normalized and it is in bit/s, but it does not refer to the modem speed (that is rather R): think of it as the capacity, in liters/s, of the sink drain pipe (the channel) which empties the boiling water from the pasta pot (your modem). Cantalamessa 12:30, 8 December 2005 (UTC)
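A short sketch of the normalization described in that comment (bandwidth and S/N figures are assumed for illustration):

```python
from math import log2

B = 4000     # assumed channel bandwidth in Hz
snr = 255    # assumed signal-to-noise power ratio (about 24 dB)

capacity_bps = B * log2(1 + snr)    # un-normalized capacity, in bit/s
spectral_eff = capacity_bps / B     # normalized by bandwidth, per the comment

print(capacity_bps)   # 32000.0 bit/s
print(spectral_eff)   # 8.0 bits per second per Hz of bandwidth
```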

Error in reference list?

Isn't there an error in the reference list? Shannon's seminal book from the 1940s is "The Mathematical Theory of Communication", whereas "The Mathematical Theory of Information" (which may prove to be seminal as well) was written by Jan Kåhre in 2002.

timonju

Fixed. Well spotted! -- Jheald 11:42, 12 February 2006 (UTC).

Error in noisy-channel coding theorem

Before the change I just made, the statement of the theorem implied that when rate equaled capacity, error could never be made arbitrarily small. This is clearly wrong; a lossless channel can achieve its capacity rate and, in spite of its being somewhat degenerate, does fall within this framework. The 1 December 2005 "fix" was wrong (though what it attempted to correct was also wrong). I've fixed this so that it's clear that the R=C case is not addressed in the noisy-channel coding theorem, but someone might want to double-check my wording on this (which is also in Noisy channel coding theorem). Calbaer 23:24, 25 April 2006 (UTC)

There's no such thing as a "lossless" additive gaussian noise channel, so whatever you did is suspect. Dicklyon 06:00, 1 September 2006 (UTC)

New edits

Since the article is named after Hartley and had almost nothing about what he showed or how it related to Shannon's capacity, I did some major edits. I took out the irrelevant thing about the data modem, and edited the other bits. And I tossed the alternate forms where instead of 1+S/N it used (S+N)/N, as just too obvious and redundant. I kept the one that commonly appears in books.

If any of this is not cool, please say so or fix it back, rather than reverting the whole lot. Dicklyon 06:00, 1 September 2006 (UTC)

What is Hartley's law?

User:First Harmonic has rewritten the Hartley's law section in terms of general pulse rates and channel capacity. The way I read the literature, it was better before. Hartley knew nothing of channel capacity, and his law was based on bandwidth. At least, that's what I find in the refs [1] [2] in combination with John Pierce's books that I have.

First Harmonic, do you have any references for your changes? Dicklyon 04:04, 4 September 2006 (UTC)

Here's the Pierce ref: [3]

I think it is somewhat a matter of semantics. Hartley's Law as it was described here prior to my recent edits consisted of an equation relating what was called "rate" to bandwidth B and the number of distinguishable levels M. I do not know whether Hartley called this quantity the rate or the capacity -- I will take your word that he used the term rate. First Harmonic 10:44, 4 September 2006 (UTC)
Nevertheless, it is clear that the quantity described is the channel capacity, and not the actual transmission rate. The bandwidth and the number of distinguishable levels set upper limits on the transmission rate, but clearly it is possible to transmit data at a pulse rate lower than the Nyquist rate 2B, and to use fewer levels than the number that are clearly distinguishable. That suggests that Hartley's Law provides the upper limit on the transmission rate, which from a purely semantic point of view should be called the channel capacity as opposed to the actual transmission rate. I think it is important to make this distinction clear to the reader, even if we decide not to call it "Hartley's Law". First Harmonic 10:44, 4 September 2006 (UTC)
I think there is a way that we can reach a compromise that would satisfy both your concerns about the historical accuracy and my issue of semantics and clarity. Perhaps we need to make it clear that although Hartley did not see things in these terms, that the current understanding in light of Shannon's work suggests that although Hartley called it the rate, he was actually talking about what we now would call capacity. First Harmonic 10:44, 4 September 2006 (UTC)
I think the connection between Hartley's information rate and Shannon's capacity is more subtle than you're making it out to be, and I already tried to describe that connection after introducing the capacity formula. The concept of capacity as defined by Shannon is a wonderful thing, which Hartley was completely unaware of. The rates achievable at a reasonable error rate from Hartley's law will generally be much less than capacity, because M has to be limited to keep the error rate down. To get close to Shannon capacity, you need a much larger M and an error-correcting code.
I'll make a cut at a revision and we'll see where we stand. Dicklyon 16:54, 4 September 2006 (UTC)
One more thing: data transfer rate is a practical thing with tenuous relationship to Hartley's more theoretical concept of information rate. Dicklyon 16:55, 4 September 2006 (UTC)
OK, I redid a bunch of stuff, trying to make the Hartley and Shannon contributions and relationships very clear. Let me know if you object to any of that, or if you have sources that contradict it. Dicklyon 17:53, 4 September 2006 (UTC)
ps. Let me re-emphasize that Hartley's rate is NOT what we understand as a capacity, except in the case that the channel is an errorless M-ary channel of 2B symbols per second. This is NOT the channel we want to analyze, which is the additive white Gaussian noise channel of bandwidth B. Now that I think of it that way, I guess I should add that to say why Hartley's rate is sometimes a capacity.
Done. And fixed up a bit what it says about capacity, and linked it, etc. Dicklyon 18:18, 4 September 2006 (UTC)
I don't agree with the latest changes that you (Dicklyon) have made. Even if you are correct from a historical perspective, I don't think it is correct from a mathematical or engineering perspective. I concede that it is important to provide the historical context for how information theory developed throughout the early 20th century, but not at the expense of clearly explaining the latest and most up-to-date understanding of the theory and practice. The historical development might deserve its own section to give readers a sense of how these ideas arose and who contributed the key breakthroughs, but that should come after a clear statement of what the theory says, what it means, and how it applies to the real world. As I said in my comments above, I think it is possible to find a middle ground that will satisfy both of us. Unfortunately, you reverted just about everything that I was trying to accomplish with my edits of last night. As I also said above, I don't think you and I really disagree about anything substantive, but only about semantics and emphasis. First Harmonic 20:34, 4 September 2006 (UTC)
On further review, I think that most of the changes you (Dicklyon) made today are really quite good and helpful. In particular, the paragraph that you added discussing how some authors call the Hartley rate a "capacity" for an idealized M-ary channel captures a lot of what I was trying to accomplish. You also cleaned up the wording quite nicely. I do, however, think that it needs a bit more explanation of the difference between an actual bit transmission rate versus a channel capacity. I will make an attempt to add some stuff, and then you can take a look and tell me what you think. First Harmonic 21:01, 4 September 2006 (UTC)
Thanks for your reconsideration. I look forward to your next edits. Dicklyon 22:07, 4 September 2006 (UTC)
Hi Dicklyon: In the sub-section entitled "Hartley's law," you use the term "achievable information rate" in the second and third paragraphs, and you represent this concept with the symbol R in the statement of Hartley's law. I am confused by a couple of things: (1) From a purely semantic point-of-view, what is the difference between "achievable information rate" and "channel capacity"? Is it not true that an "achievable rate" is the same thing as a "capacity"? (2) Later in the article, when you begin discussing the Shannon-Hartley theorem, you again use the symbol R, but now it represents the actual information rate, rather than the achievable information rate. To me, this use of a single symbol to represent two different concepts is ambiguous and confusing. I think it would be far less confusing to use the symbol C to represent achievable rate (or capacity) in both cases. First Harmonic 01:13, 5 September 2006 (UTC)
I'm not sure what you mean by semantic, but channel capacity is a statistical concept that Shannon came up with, which is not usually achievable as an information rate (except for error-free discrete channels, which are non-statistical), and achievable rate is a concept that Hartley had, based on a bandwidth and what M could be achieved without error. It's a subtle but important difference, since to attribute a capacity law to Hartley would be anachronistic, and since the rate equation uses an M that is not really a property of the channel.
Perhaps a subscript on Hartley's achievable R would be good to distinguish it from the more general R in the capacity discussion. Quite often such distinctions are not made, and the symbol R is used for rate in more than one context. To me, however, the bigger confusion is to put a C in Hartley's law and represent it as a capacity. But, some authors do so. I think a subscript M to indicate the dependence on M, rather than just on the channel, would be appropriate. Dicklyon 05:21, 5 September 2006 (UTC)


Test calculation for noisy transmission in Hartley's model: In the transmission model of Hartley, if one assumes M voltage levels with an equidistant distribution from -V to V that are used uniformly at random, one gets an average signal power (here power is the square of voltage; the usual power differs by a constant factor) of S=\frac13 V^2\left(1+\frac{2}{M-1}\right). Now, Gaussian noise of average power N causes a transmission error with a standard deviation of \sqrt{N}. Demanding an error rate of less than 1% requires that the half-distance V/(M-1) between the voltage levels be greater than three standard deviations (the actual error rate is then 0.3%), or N\le \frac19 \frac{V^2}{(M-1)^2}. A channel with those properties has a capacity greater than

C = B\log_2(1+S/N) = B\log_2\left(1+3(M-1)^2\left(1+\frac{2}{M-1}\right)\right) \approx 2B\log_2(\sqrt3\,M).

For n standard deviations, with error rate \mathrm{erfc}(\frac{n}{\sqrt2}) (see normal distribution), the factor in the last equation becomes \frac{n}{\sqrt3}. This factor is (for large M) the ratio between the number of "virtual" voltage levels in Shannon's noisy transmission theory and the number of actual voltage levels in Hartley's transmission model.--LutzL 09:47, 5 September 2006 (UTC)
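The test calculation above can be checked numerically; this is a sketch with assumed values V = 1 and M = 16 (not figures from the original comment):

```python
from math import log2

def hartley_vs_shannon(M, V=1.0, n_sigma=3):
    """Compare Shannon bits/symbol with Hartley's log2(M) for M equidistant
    levels in [-V, V], with noise sigma at 1/n_sigma of the half-distance."""
    S = (V**2 / 3) * (1 + 2 / (M - 1))      # mean signal power
    N = (V / (n_sigma * (M - 1)))**2        # largest allowed noise power
    shannon_bits = 0.5 * log2(1 + S / N)    # C / 2B, in bits per symbol
    return shannon_bits, log2(M)

shannon_bits, hartley_bits = hartley_vs_shannon(M=16)
# For large M the difference approaches log2(sqrt(3)), about 0.79 bits/symbol.
print(shannon_bits - hartley_bits)
```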

Shannon's noisy transmission theory: Shannon uses a sphere-packing argument in K-dimensional space to derive the Shannon capacity as an upper limit for the data rate. This argument simply divides the volume of a ball of radius \sqrt{S+N} by the volume of a ball of radius \sqrt{N} to get a bound on the number of distinguishable codes of length K. This is to be compared with the number M^K of different codes using independent symbols with M voltage levels. Hence the \sqrt{1+\frac{S}{N}} "virtual" voltage levels per symbol. The case K=1 corresponds to Hartley's model.

To get a lower bound, Shannon computes in dimension K the estimate p \le \left(\sqrt{\frac{N}{S+N}}\right)^K for the probability p of finding a random point on the sphere of radius \sqrt{S} inside the ball of radius \sqrt{N} centered on a fixed point at radius \sqrt{S+N}. The probability that all but one of C random points of radius \sqrt{S} lie outside the small ball is (1-p)^{C-1} \ge 1-(C-1)p \ge 1-Cp. If the C code points are chosen randomly, then the average error probability of finding more than one codeword close to the received message is e = 1-(1-p)^{C-1} \le Cp. The average number of bits per symbol therefore satisfies the estimate

\frac{\log_2 C}{K} \ge \frac{\log_2 e - \log_2 p}{K} = \frac12\log_2\left(1+\frac{S}{N}\right) - \frac{|\log_2 e|}{K}.

Using this estimate on the previous example results in an average increase of 0.79 − 8.4/K bits/symbol using Shannon's transmission method. This is only positive for K > 10 and yields a gain of one bit every two symbols for K > 28.

Shannon's argument repeatedly uses the effect of high dimensions on the statistics of the radius of a multidimensional normal distribution (see chi-squared distribution). Thus it is only valid for rather large values of K. --LutzL 13:54, 5 September 2006 (UTC)
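The block-length trade-off in that estimate can be verified in a few lines; the 0.3% error rate and the 0.79 bits/symbol asymptotic gain are taken from the example above:

```python
from math import log2

penalty = abs(log2(0.003))   # |log2 e|, about 8.4 bits for the ~0.3% error rate

def gain_per_symbol(K, asymptotic_gain=0.79):
    """Average bits/symbol gained over Hartley's scheme at block length K."""
    return asymptotic_gain - penalty / K

print(gain_per_symbol(10) > 0)     # False: no gain yet at K = 10
print(gain_per_symbol(11) > 0)     # True: the gain turns positive past K = 10
print(gain_per_symbol(29) > 0.5)   # True: about half a bit per symbol near K = 29
```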


I think the reason some authors use the word "capacity" instead of "achievable rate" is because in common, everyday usage, the two terms mean more or less the same thing. I am not saying that the quantity expressed in Hartley's law is equivalent to the channel capacity as formulated by Shannon, and obviously Hartley could have no way of knowing what Shannon would do 20 or 30 years down the road. What I am suggesting is that Hartley's law is more of a precursor to Shannon's work than the article suggests, because Hartley's quantity is in fact a capacity: not Shannon's channel capacity, but Hartley's notion of capacity. The two are different because Hartley's is based, as Dicklyon pointed out, on assuming an idealized M-ary pulse-rate channel rather than an AWGN channel. But what I am arguing, and some authors agree, is that Hartley's law provides an expression for a channel capacity, and not a data transmission rate. First Harmonic 12:01, 5 September 2006 (UTC)
Furthermore, later on in the Wiki article, the authors have set Hartley's expression equal to Shannon's channel capacity to find a relationship between M and the S/N ratio. If Hartley's is a rate, and Shannon's is a capacity, then it wouldn't make a lot of sense to set the two equal to each other. First Harmonic 12:04, 5 September 2006 (UTC)
Finally, suppose I have a channel of bandwidth B with a signal-to-noise ratio that can support up to M distinguishable voltage levels. Further, suppose that I choose to use a coding scheme where the actual pulse rate f is less than the Nyquist rate 2B. Likewise, suppose the actual number of levels L is also less than the number of distinguishable levels M. Then I would argue that the information transmission rate, the actual rate R, is given by
R = f \log_2(L)
and not by
R = 2B \log_2(M)
And so in fact, since the first expression is smaller than the second expression, and the second expression really represents an upper limit on the data transmission rate, then I should be able to transmit information reliably over this channel at my chosen transmission rate. And again, based on common everyday usage, that suggests the first expression is the actual transmission rate, while the second expression is the capacity, or upper limit on the transmission rate. First Harmonic 12:11, 5 September 2006 (UTC)
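That argument can be stated as a one-line check, with assumed figures (B = 3000 Hz, M = 8 distinguishable levels, a chosen pulse rate f = 4000 below the Nyquist rate 2B = 6000, and L = 4 levels below M):

```python
from math import log2

B, M = 3000, 8    # assumed bandwidth (Hz) and distinguishable levels
f, L = 4000, 4    # assumed pulse rate below 2B and level count below M

actual_rate = f * log2(L)         # rate actually used: 8000 bit/s
hartley_limit = 2 * B * log2(M)   # upper limit: 18000 bit/s

print(actual_rate <= hartley_limit)   # True: the chosen rate stays below the limit
```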
One other thing: I am not saying that Hartley came up with the "right" answer for channel capacity, or that Hartley's capacity is equivalent to Shannon's capacity. All I am saying is that Hartley came up with an expression for capacity, not the expression for capacity, and that although his expression is not absolutely correct, he was certainly on the right track. First Harmonic 12:26, 5 September 2006 (UTC)
What you don't get is that Hartley's law is simply a direct conversion of voltage levels into bits per symbol. It doesn't say anything about noise or errors. Those require a separate analysis, which suggests that for very low error rates one can gain some bits per symbol by using a higher number of voltage levels and forming code blocks of several symbols. The error rate per symbol will increase, but with lowest-distance matching in a randomly selected codebook, the overall error rate will be equal to or lower than the one assumed in Hartley's transmission method. The underlying transmission method does not matter up to this point. Only if one wants to give a rate of bits per time does one have to take the nature of the transmission channel into account.--LutzL 13:54, 5 September 2006 (UTC)
Reading the article again in the light of these arguments, I find them well represented in the present version.--LutzL 14:04, 5 September 2006 (UTC)
Hartley was certainly on the right track. But there was a long stagnant period and huge conceptual leap needed before Shannon got to the concept of capacity. Dicklyon 15:49, 5 September 2006 (UTC)
To DickLyon and LutzL: I respectfully request that you please re-read the arguments that I made earlier today (above), and respond to each of the points individually. Essentially you have said that you disagree with me, and the reason you offer for disagreeing is actually an echo of some of the very arguments that I have been making. You also have not addressed one of the key arguments that I have made, which is related to transmission rates below the Nyquist rate with fewer quantization levels than the channel can distinguish. As I have stated (more than once), I believe that it is possible to reach a compromise that will satisfy both of us. It is starting to appear, however, that you do not share my optimism. Are you willing to meet me partway, or are you simply planning to dig in? First Harmonic 22:02, 5 September 2006 (UTC)
The following is a direct quote from the first paragraph of Hartley's paper entitled "Transmission of Information":
"What I hope to accomplish (...) is to set up a quantitative measure whereby the capacities of various systems to transmit information may be compared."
I find it interesting that Hartley himself used the term "capacities...to transmit information" and not the term "achievable rate to transmit information." First Harmonic 22:55, 5 September 2006 (UTC)
I agree. Very Interesting. I still think it would be unwise to confuse his "capacity to transmit information" with Shannon's more formalized concept of "channel capacity." Dicklyon 23:09, 5 September 2006 (UTC)
Then we are at an impasse. In my opinion, as it stands now, I believe that the article is factually incorrect. I have made more than one fairly good argument (though not ironclad) to support my opinion. You have not done much to refute anything that I have argued, and your current position is only that what I want to do is "unwise," which is a fairly subjective argument. I have made numerous attempts to offer you an opportunity to reach a compromise that would make us both happy, and you have not responded even once that you are willing to consider anything that I have to say or to back off even slightly from your position. I have even gone to the original source (Hartley's 1928 paper) and found that Hartley himself proclaimed that the very purpose of his paper was to establish the capacities of various transmission systems, in direct contradiction to your claim that Hartley had no notions related to capacity. And yet, it appears that you are still unwilling to consider anything other than what you already know to be true. I really am confused. First Harmonic 00:24, 6 September 2006 (UTC)
I don't think it's quite at that point yet. Let's talk. If you want to put the word capacity in there for Hartley (which I did already), just be careful not to confuse it with Shannon's meaning. And as to the multi-level below-the-Nyquist rate information rate, that's a possibly interesting bit to add, but is not Hartley's law. See if you can work it in as background. Please remind me if there are other points you see a response to, as it takes forever to try to scan the discussion and see what you think I dropped. And I can't speak for LutzL. Dicklyon 01:17, 6 September 2006 (UTC)
Thank you for having an open mind and giving me a fair chance. I need to think about how best to approach it. I will probably take some time before I can present a straw-man for consideration. Keep an eye on this page for a proposal. Thanks again. First Harmonic 03:39, 6 September 2006 (UTC)
Okay, I made a bunch of changes to the article last night and this morning. There is still more that I want to do, but it's a start. Take a look, see what you think. If you don't like something, go ahead and change it or improve it. Thanks. First Harmonic 13:23, 6 September 2006 (UTC)
I'll look more carefully later, but so far I like what you've done. Thanks. Dicklyon 19:40, 6 September 2006 (UTC)


I think that Hartley's "capacity to transmit" is meant as a quality, the same as "ability to transmit". To compare different systems with respect to that quality, one needs to put them on a scale, that is, assign a quantitative measure to that quality. It turned out that Hartley's proposal was right on track, even if it mixes a theoretical with a practical quantity.
Indeed, and in Hartley's approach the scale of bits per second was parameterized by B and M. In Shannon's it was parameterized by B and S/N. These cannot really be put into alignment other than by comparing what M corresponds to what S/N when the bits-per-second numbers are equated, even though their meanings are not quite the same. Dicklyon 16:04, 7 September 2006 (UTC)
Transmitting via a real-world system at the Nyquist rate 2B, while at the same time ensuring the frequency bound B, will lead to noise from ISI (inter-symbol interference). Since this is a systematic disturbance, it may be reduced by a digital post-filter adapted or calibrated to the channel characteristics. IMO, one would then, before changing the physical setup of the system, go for error-correcting codes that reduce the transmission rate as well. But then I don't know more of Hartley's paper than is cited here.--LutzL 06:55, 7 September 2006 (UTC)
Actually, Nyquist showed that 2B is the highest pulse rate such that zero ISI can be ensured, so ISI is removed from the issue in these cases. The picture can always be made more complicated by invoking phrases like "real world", but within the mathematical model that their proofs apply to, this is the truth. And of course you are right that if there is any Gaussian noise, you do need to go to error-correcting codes to get arbitrarily close to zero error rate; but Hartley didn't know that yet (or did, but didn't have a mathematical model for it). Dicklyon 16:04, 7 September 2006 (UTC)
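The zero-ISI property at the Nyquist rate can be checked numerically: the ideal bandlimited pulse g(t) = sinc(2Bt) crosses zero at every other symbol instant t = kT, T = 1/(2B), so pulses spaced at rate 2B do not interfere at the sampling instants. The bandwidth value is an arbitrary illustration:

```python
import numpy as np

B = 4000.0            # Hz, illustrative bandwidth
T = 1 / (2 * B)       # Nyquist signaling interval

# Ideal bandlimited pulse: np.sinc is the normalized sinc, sin(pi x)/(pi x),
# so g(k*T) is 1 at k = 0 and 0 at every other integer k: no ISI at rate 2B.
def g(t):
    return np.sinc(2 * B * t)

k = np.arange(-5, 6)
samples = g(k * T)
print(np.round(samples, 12))   # 1 at the center tap, 0 elsewhere
```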

[edit] Hartley's law again

First Harmonic, your edits look good. I'm assuming that since you added the Wozencraft and Jacobs citation, the equation you used is probably from there. I'll check mine tomorrow at work to verify. But if so, then a reference link from that sentence to the book would be in order, yes? Dicklyon 01:58, 13 October 2006 (UTC)

Yes it is in Wozencraft & Jacobs, as I indicated in the Edit Summary (see the History page). First Harmonic 02:55, 13 October 2006 (UTC)
Thanks. I should have checked your diffs and history. Dicklyon 03:30, 13 October 2006 (UTC)
Yes, a reference link would be fine, although I don't think it is necessary since I added W&J to the list of references at the end of the article. Also, I have no idea how to create such a link, nor is it clear that WP has a consistent or well-established policy on how to deal with footnotes and references. There are several different approaches in many different articles, and I am not really interested in sorting it out. But feel free to take a stab. First Harmonic 02:55, 13 October 2006 (UTC)
I know what you mean. I do feel free, but I may not take it on either. Dicklyon 03:30, 13 October 2006 (UTC)
Personally, I like the way that the IEEE does references in its articles, where the references are listed at the end of the article either in alphabetical order by author or in the order in which the article cites the references. Each reference is numbered from 1 to N, and the citations within the text simply mention the number of the reference, usually enclosed within square brackets, as in [3]. But that's just me. First Harmonic 13:57, 13 October 2006 (UTC)
BTW, the official WP policy is here if anyone is interested. First Harmonic 14:00, 13 October 2006 (UTC)

[edit] Shannon Limit

This page redirects from "Shannon Limit" but never once uses the phrase "Shannon Limit". That makes it not very useful for someone trying to find what "Shannon Limit" means.

Google says "Shannon Limit" appears 115,000 times, but "Hartley theorem" appears only 534 times, many of those being reprints of Wikipedia. (For Google Scholar -- i.e., recent real research -- it's 5,700 to 70.) The IEEE journals list 15 articles entitled "Shannon limit," one entitled "Shannon-Hartley," and none entitled "Hartley theorem".

Absent evidence to the contrary, the standard terminology appears overwhelmingly to be "Shannon limit". (Perhaps because, whatever Hartley did, it was Shannon's 1948 papers that launched the research that has brought communications close to that limit.)

So we need a separate article about the Shannon Limit -- that's what people will be looking up and wanting to understand. If someone wants to say "Also called the Shannon-Hartley Theorem," or "builds on the work of Ralph Hartley," that's fine -- but the field calls it the Shannon Limit.

The term "Shannon limit" appears so frequently because it means so many different things. In general, if someone says their system performs "close to the Shannon limit", they mean close to the bounds imposed by Shannon's information theory. It is used for the capacity of a bandlimited Gaussian noise channel, as in the Shannon–Hartley theorem, but also for binary symmetric channels and other channels, and for source coding (lossless data compression). One very common use is for the required energy per bit or SNR to achieve reliable transmission in the presence of noise; sort of the dual problem of finding a rate limit given an SNR. So, I don't see how you can have an article on the "Shannon limit", since it's not a defined concept. You have articles on information theory and this one on a particular theorem of information theory. If there's some other article that appears to be missing, by all means start it. Dicklyon 04:43, 29 October 2006 (UTC)

[edit] Merge with "Noisy channel coding theorem" article?

Speaking of the Shannon limit, the Noisy channel coding theorem article also claims to be the Shannon limit article. A sentence in the Shannon-Hartley theorem article says the Shannon-Hartley theorem is an "application of the Noisy channel coding theorem". When I read the noisy channel theorem page, however, I find it is essentially the Shannon-Hartley theorem, replete with many of the same formulas. Shouldn't these two articles be merged? Or are there really two concepts here and we should keep them distinct? What do other folks think about a merge? -- technopilgrim 20:23, 30 October 2006 (UTC)

That's definitely a better place for Shannon limit to redirect to, so I changed it. The coding theorem is much more general. The Shannon–Hartley theorem is the application of it to the case of a bandlimited continuous-time channel with additive white Gaussian noise, which is quite specific. I oppose a merge. Dicklyon 21:20, 30 October 2006 (UTC)
The articles make quite clear the relationship between the two. The Noisy channel coding theorem establishes that, in principle, data can be sent without error at a rate up to the Shannon channel capacity C. This is a fundamental result about the meaning of channel capacity in information theory, regardless of the channel or the noise process. On the other hand, the Shannon-Hartley theorem is about the calculation of C specifically for the case of a continuous channel with a Gaussian noise process, and then specifically applying the result of the noisy channel coding theorem.
The two are distinct, and well factored into their two separate articles. Merger would be distinctly unhelpful. Jheald 21:23, 30 October 2006 (UTC)

[edit] Can Shannon-Hartley explain different uplink and downlink speed in V.92 modems?

A V.92 modem can handle 56 kbit/s in the downlink but only 48 kbit/s in the uplink. Can this difference be explained by different information capacities, according to Shannon–Hartley, in the uplink and in the downlink? In the downlink, the modulator has a digital interface and utilizes the PCM system, sending one symbol per sample. Perhaps the inter-symbol interference is lower in this case, since the modem symbols are synchronized with the PCM sample instants? Perhaps the maximum signal strength is lower, since it may be hard to identify the maximum possible amplitude?

What is the S/N in a PCM system if there is no noise? Is there a difference between Europe and America because of the different PCM systems?

I think it is more appropriate to use the Nyquist capacity limit instead of the Shannon limit when discussing the PCM case, since we know the number of possible levels. Note, however, that Shannon gives the net bit rate with an ideal error-correction code, while Nyquist gives the gross bit rate.

We may assume that we could use all N = 256 levels, resulting in a gross bit rate capacity of 2*B*log2(256) = 16*B bit/s, where B is the analog bandwidth in hertz. If we assume a bandwidth of B = 4,000 Hz (i.e. ideal filtering), then according to Nyquist we could get 16*4,000 = 64,000 bit/s. If we instead assume only 3,400 - 300 = 3,100 Hz of bandwidth, we would get 49,600 bit/s. In practice, frequencies outside the passband may also be utilized. The V.92 maximum downlink speed corresponds to 56,000/16 = 3,500 Hz of bandwidth.

N = 256 levels, without any noise, gives the same bit rate as the Shannon limit would give if the SNR were 20*log10(256), about 48 decibels.
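The arithmetic in the two comments above can be checked in a few lines. This is only a sketch of the figures already quoted (Nyquist gross rate 2B log2(N), the implied V.92 bandwidth, and the equivalent SNR of 256 noiseless levels), not a model of an actual V.92 modem:

```python
import math

def nyquist_gross_rate(B, M):
    """Nyquist gross bit rate for bandwidth B (Hz) and M signal levels: 2B log2(M)."""
    return 2 * B * math.log2(M)

N = 256                                 # PCM levels (8-bit samples)
print(nyquist_gross_rate(4000, N))      # 64,000 bit/s with ideal 4 kHz filtering
print(nyquist_gross_rate(3100, N))      # 49,600 bit/s with a 300-3,400 Hz passband
print(56000 / (2 * math.log2(N)))       # 3,500 Hz implied by V.92's 56 kbit/s
print(20 * math.log10(N))               # ~48.2 dB equivalent SNR for 256 levels
```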

Mange01 23:29, 30 November 2006 (UTC)