Talk:Gini coefficient/Archive 1

From Wikipedia, the free encyclopedia

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Figure Incorrect?

According to the map showing Gini coefficients for all countries, Greenland has a Gini coefficient that is < 0,25. However, Statistics Greenland gives these coefficients in its latest publication on income (based on 2004 data): income: 0,46; income after taxes: 0,44; disposable income (includes social benefits from the state): 0,41. For those of you who read Danish, see here for further info: www.statgreen.gl

I have found data from the Census Bureau that conflicts with the Gini values in the diagram. The figure appears to be incorrect. For example, the US did not have a Gini coefficient lower than .4 after 1977.

I am pretty sure this map is wrong, dated, or both. Russia has at least 40%, according to the Wiki page on post-Soviet Russia, and Hungary is certainly no Greenland-like outlier in Central Europe! varbal 00:23, 12 September 2006 (UTC)


Hi, I have a question about calculating the Gini coefficient: at this site they say you can calculate the Gini coefficient as A/(A+B), but my question is: how do you calculate A and B?

How is your integral calculus? If you have curves available, you integrate the area under each curve. If no curves have been created yet, then you need to construct them. mydogategodshat 16:36, 11 May 2004 (UTC)
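When only individual incomes are available rather than smooth curves, the integration suggested above can be approximated numerically. The following is a minimal sketch, assuming that A is the area between the diagonal and the Lorenz curve, that B is the area under the Lorenz curve, and that the curve is piecewise linear between data points. The function names and the income figures are made up for illustration.

```python
def lorenz_points(incomes):
    """Return the (x, y) points of the Lorenz curve, starting at (0, 0)."""
    ordered = sorted(incomes)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, income in enumerate(ordered, start=1):
        running += income
        # x = cumulative share of people, y = cumulative share of income
        points.append((i / n, running / total))
    return points


def areas_a_and_b(incomes):
    """Return (A, B): A lies between diagonal and curve, B under the curve."""
    points = lorenz_points(incomes)
    b = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        b += (x1 - x0) * (y0 + y1) / 2  # area of one trapezoid strip
    a = 0.5 - b  # the triangle under the diagonal has area 1/2
    return a, b


incomes = [1, 1, 2, 4]  # illustrative data, not taken from any source
a, b = areas_a_and_b(incomes)
gini = a / (a + b)
print(a, b, gini)
```

Because A + B is always 1/2, the ratio A/(A+B) is the same as 2A, or equivalently 1 - 2B.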


Yes, I think what you mean is: for instance, suppose that in a perfectly egalitarian society everyone has the same income. The curves are degenerate. The correct way to think of this is as probability distribution functions, and the Gini coefficient is a measure of non-uniformity, such as Rényi entropy (but most certainly not Shannon entropy). I'll think about this. CSTAR 22:50, 17 May 2004 (UTC)


I added some information on Gini coefficients in the U.S. It'd be better to have it for other countries as well — does anyone have that sort of data? Factitious 16:00, Oct 13, 2004 (UTC)

  • Yes, they are in the UN Human development report linked in the page - I will add some... - Marcika 22:29, 18 Nov 2004 (UTC)

It seems, the map used is a little outdated, as it shows Germany still divided in two countries, which is obsolete since October 3rd, 1990. —Preceding unsigned comment added by 141.113.85.21 (talk) 08:16, 4 October 2007 (UTC)

Long tail

I suggest that the following sentence be removed:

"There is an implication built into the Gini coefficient that a straight-line distribution is a desirable outcome, which in the newly evolving long tail economics may not be the case."

First, I see no such implication. Second, "the newly evolving long tail economics" is far from achieving widespread recognition. Third, the comment is highly speculative. TomSlee 17:50, 26 Jun 2005 (UTC)

I strongly agree: the quoted sentence should be removed from the article. One further point. There is a big problem about which raw data should be used for calculation. In particular, survey data on household expenditures yield much higher Gini coefficients than national accounting data. There are arguments for/against each choice. This should be mentioned, and then the choice underlying the data given in the article should be stated. --Mario 12:09, 16 July 2005 (UTC)
I removed the "long tail" phrase. I read the long tail article and it gave no indication of how that idea relates to the Gini coefficient and wealth distribution. AdamRetchless 18:54, 5 August 2005 (UTC)

---

What are the advantages of using the Gini coefficient instead of the variance? I think this should be pointed out.

Moreover, I do not understand this sentence: "The small sample variance properties of G are not known, and large sample approximations to the variance of G are poor." Is this unclear, or is it only me?

---

Are you sure the formula is correct?

I think it should be (X_{k+1} - X_{k}) * Y_{k+1}

Y is said to be "cumulative" already, so I don't see why you would sum Y_k and Y_{k+1}. Alternatively, you could multiply by (Y_k + y_{k+1}), where Y_k is cumulative up to k and y_{k+1} is the exact value for the (k+1)-th sample.
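One possible reading of the sum, assuming the formula in question is the trapezoid-rule approximation and that X and Y are both cumulative shares: each strip under the Lorenz curve is a trapezoid whose height is the average of Y_k and Y_{k+1}, and the factor 1/2 from that average cancels against the factor 2 in G = 1 - 2B, where B is the area under the curve.

```latex
B \approx \sum_{k} (X_{k+1} - X_k)\,\frac{Y_k + Y_{k+1}}{2},
\qquad
G = 1 - 2B \approx 1 - \sum_{k} (X_{k+1} - X_k)\,(Y_k + Y_{k+1}).
```

Under the same assumptions, a term of the form (X_{k+1} - X_k) Y_{k+1} would be a right-endpoint rectangle rule, which overstates B because the Lorenz curve is non-decreasing.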

---


A simple description is missing.

I can find no place in the article that gives the value of a perfectly "equal" distribution. A reader might think it is .45 or 0 or 1. A clarification should be in the overview.

A clarification is in the overview. The third sentence of the article reads: "The Gini coefficient is a number between 0 and 1, where 0 corresponds with perfect equality (where everyone has the same income) and 1 corresponds with perfect inequality (where one person has all the income, and everyone else has zero income)." I think it cannot be made much clearer. -- Marcika 14:35, 27 July 2005 (UTC)


VOTE!! - HDI in country infobox/template?

The Human Development Index (HDI) is a standard UN measure/rank of how developed a country is or is not. It is a composite index based on GDP per capita (PPP), literacy, life expectancy, and school enrollment. However, as it is a composite index/rank, some may challenge its usefulness or applicability as information.

Thus, the following question is put to a vote:

Should any, some, or all of the following be included in the Wikipedia country infobox/template:

(1) Human Development Index (HDI) for applicable countries, with year;
(2) Rank of country’s HDI;
(3) Category of country’s HDI (high, medium, or low)?

YES / NO / UNDECIDED/ABSTAIN - vote here

Thanks!

E Pluribus Anthony 01:52, 20 September 2005 (UTC)

Effect of adding populations

User DL5MDA made some remarks on the effect of calculating the index separately for partial populations or for the whole together. They were incorrect. See for example the extreme case of two regions, each of which has perfect equality of income. However, in one region each person earns double the income of a person in the other region. Assume that equally many persons live in both regions. Now join the regions together. Everyone in the poor region will be on the left half of the curve, reaching a total of 1/3 of total income. The remaining 2/3 of income is in the right part of the curve. A simple calculation shows that the index will now be 1/6 (about 17%). So merging these two populations, each with index 0, yields an index of 17% together. −Woodstone 12:14, 24 September 2005 (UTC)
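The two-region example can be checked numerically. This is a minimal sketch, assuming 100 people per region (any equal region size gives the same result) and using the sorted-rank form of the mean-difference definition; the function name `gini` is only illustrative.

```python
def gini(incomes):
    """Gini coefficient of a whole population (no small-sample correction)."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    # Sum over all pairs of |x_i - x_j| equals 2 * sum((2i - n - 1) * x_(i)),
    # so G = sum((2i - n - 1) * x_(i)) / (n * total).
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(ordered, start=1))
    return weighted / (n * total)


poor_region = [1] * 100  # everyone earns 1
rich_region = [2] * 100  # everyone earns double
merged = poor_region + rich_region

print(gini(poor_region), gini(rich_region), gini(merged))
```

Each region on its own gives 0, while the merged population gives 1/6.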

Disadvantages

I don't know what this quote means:

  • The Gini coefficient is an often abused measure, ie it is often used to imply that one value is better or worse then another. This is not the case as other then the very extremes in most cases there is no way to decide if any number if better or worse then any other.

Any measurement can be "abused" -- is there something about the Gini that makes it more vulnerable to abuse than any other statistic? Afelton 17:51, 1 November 2005 (UTC)

Actually, yes; it condenses the Lorenz curve into a single number that hides a great deal of information. Extremely different shapes of Lorenz curves can give the same Gini coefficient, and those who do not understand the Gini coefficient often assume that different countries with the same Gini coefficient have similar income distributions. This is just one among the many ways in which the Gini coefficient can be used in misleading ways. The Gini coefficient can be very useful, but it needs to be properly used, and it often is not. —Lowellian (reply) 12:14, 15 March 2006 (UTC)

Basically, it's not accurate to say that "inequality has increased" JUST because the Gini coefficient went up. If two different Lorenz curves cross, then inequality will have increased at one end of the distribution but decreased at another. In that case one cannot rank the two distributions in terms of inequality based on the calculated value of the coefficient without making further assumptions about what "inequality" means (this is essentially because when you summarize an entire distribution with a single number you lose some information). However, in practice this is often ignored, even by academic researchers. I think there should be something in the article to address this fact.
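A small constructed example may make the point about crossing Lorenz curves concrete. The two six-person populations below are invented for illustration; they have the same Gini coefficient, but their Lorenz curves cross, so neither is unambiguously "more equal" than the other.

```python
def gini(incomes):
    """Population Gini coefficient via the sorted-rank formula."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(ordered, start=1))
    return weighted / (n * total)


def bottom_share(incomes, k):
    """Share of total income held by the k poorest people (a Lorenz point)."""
    ordered = sorted(incomes)
    return sum(ordered[:k]) / sum(ordered)


pop_a = [1, 1, 1, 2, 2, 2]  # two income classes, nobody at zero
pop_b = [0, 1, 1, 1, 1, 1]  # one person with nothing, the rest equal

print(gini(pop_a), gini(pop_b))  # both equal 1/6
# Poorest sixth: A's curve is above B's.
print(bottom_share(pop_a, 1), bottom_share(pop_b, 1))
# Poorest half: B's curve is above A's, so the curves cross.
print(bottom_share(pop_a, 3), bottom_share(pop_b, 3))
```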

About this point:

  • Economies with similar incomes and Gini coefficients can still have very different income distributions. This is because the Lorenz curves can have different shapes and yet still yield the same Gini coefficient.

It would be an interesting and practical addition to the article to cite a few pairs of countries with similar GDP and Gini coefficients but different income distributions. R4ubix (talk)

People or households?

The definition at the beginning of the article is:

"...It is a number between 0 and 1, where 0 corresponds to perfect equality (e.g. everyone has the same income) and 1 corresponds to perfect inequality (e.g. one person has all the income, and everyone else has zero income)." (My bold).

The definition in The Economist's Essential Economics is:

"...It varies between zero, which indicates perfect equality, with every household earning exactly the same, to one, which implies absolute inequality, with a single household earning a country's entire income." (My bold).

Is there a difference between "people" and "households"? Which is correct? Tamino 08:04, 3 May 2006 (UTC)

Both are incorrect, of course; the number will never reach one, even if there is a single person in a single household (though it will be very very close).
My understanding is that, technically, the income to be used for calculating the Gini coefficient should be supplemented with an imputed income/loss of income due to other members of a household. I don't really know what's done in practice, but I wouldn't be surprised if a constant household size were assumed.
RandomP 15:49, 1 July 2006 (UTC)


People or households, which is correct? It depends; neither is right or wrong all the time. It depends on how and in what context you use it. The Gini coefficient is like any other descriptive statistic. You wouldn't ask generically: average income per household or average income per individual, which is correct? And you wouldn't hear a sports fan ask generically: which is correct, average points per game for a team or average points per game for an individual player? It depends on what you want to do. Just be careful about mixing apples and oranges.
Up to but not including 1: Random P is correct. That is addressed in the mean difference article, which is a little more technically detailed and precise than the Gini coefficient article. For example, the statement about being between 0 and 1 also depends on negative values not being allowed for the underlying measured values. -DCary 00:08, 3 July 2006 (UTC)
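Both remarks can be illustrated with a short sketch, assuming the population (mean-difference) definition of the coefficient without any small-sample correction. With n people and one person holding all the income, the value is (n - 1)/n, which approaches but never reaches 1; once negative values are allowed, the value can exceed 1. The data are invented for illustration.

```python
def gini(incomes):
    """Population Gini coefficient via the sorted-rank formula."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(ordered, start=1))
    return weighted / (n * total)


# One person has everything: the result is (n - 1) / n, always below 1.
for n in (2, 10, 1000):
    one_has_all = [0] * (n - 1) + [1]
    print(n, gini(one_has_all))

# With a negative value (for example, net debt) the bound of 1 no longer holds.
with_negative = [-2, 4]
print(gini(with_negative))  # 1.5
```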

Calculation

The supporting details for the Brown formula don't make sense. X_k is being used on the left side to denote a cumulated amount, while X_m is being used on the right side to denote a non-cumulated amount. Since m runs from 1 to k, this appears to be a circular or implicit definition of X_k, but it is not supposed to be. Likewise for Y_k.

It would be nice to explicitly list separate formulas or explain the application of formulas for:

  • a numerical approximation to the true value. (This appears to be one of the uses of the Brown formula.)
  • a population (applicable especially to small populations)
  • a discrete probability function (the article on the Lorenz curve does not cover this case)
  • a sample from a population.

DCary 21:19, 25 May 2006 (UTC)

You are right: Xn and Yn are used in two conflicting ways. I removed the unnecessary and faulty formulae that were added at some point in time. −Woodstone 21:48, 25 May 2006 (UTC)
And yes, it would be interesting to see the gini coefficient of a normal distribution. Might take a while to find out. −Woodstone 21:51, 25 May 2006 (UTC)
I calculated the Gini coefficient for a normal distribution with a mean of 1 and standard deviation of 1: G(N(1,1)) = 0.56418958. That means that for an arbitrary mean m and standard deviation s, G(N(m,s)) = 0.56418958 * s / m. Not tremendously difficult if you have some of the basic formulas. I'll work on adding them to the article. −DCary 02:39, 1 June 2006 (UTC)
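The constant quoted above matches 1/sqrt(pi). A sketch of one way to check it, assuming the Gini coefficient is taken as half the relative mean difference: the mean absolute difference of two independent N(m, s) draws is 2s/sqrt(pi), so G = s/(m*sqrt(pi)). The simulation below is only a rough sanity check with an arbitrary seed and sample size, and the function names are illustrative.

```python
import math
import random


def gini_normal(mean, sd):
    """Closed form for N(mean, sd), assuming G = mean difference / (2 * mean)."""
    return sd / (mean * math.sqrt(math.pi))


def gini(values):
    """Population Gini coefficient via the sorted-rank formula."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(ordered, start=1))
    return weighted / (n * total)


closed_form = gini_normal(1.0, 1.0)

rng = random.Random(0)  # fixed seed so the check is repeatable
draws = [rng.gauss(1.0, 1.0) for _ in range(100_000)]
simulated = gini(draws)

print(closed_form, simulated)
```

Note that N(1,1) takes negative values with substantial probability, so this is the formal value of the statistic rather than a description of a realistic income distribution.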

The statement about multiplying the Gini coefficient of a sample by n/(n-1) to get an unbiased estimator of the population value is wrong. It needs to be removed or qualified in some way. In fact, it appears not difficult to show that it is impossible in the general case to calculate from a sample an unbiased estimator for the population value. −DCary 22:33, 31 May 2006 (UTC)

The statement "large sample approximations to the variance of G are poor" needs some clarification. What is meant by "large sample approximations to the variance of G"? In what sense, by what measure are they poor? −DCary 22:33, 31 May 2006 (UTC)

I removed the Brown eponymy for the formula based on the trapezoid rule because it is a straight forward application of the trapezoid rule which is a generic math formula, the only association I could find of a Brown with the formula was in (Brown, 1994), and the formula for approximating the Gini coefficient has published uses at least as early as (Morgan, 1962). If there is a good reason to name the formula after Brown, please explain.

Material that addresses some of the other issues in this section of discussion was put in the new article about the mean difference and relative mean difference. -DCary 16:19, 27 June 2006 (UTC)

"Note how this corresponds to"

"Note how this corresponds to the lowering of the highest tax bracket, for example, from 70% in the 1960s to 35% by 2000." I don't understand this sentence. It sits outside any other paragraph. What does "this" refer to?

"This" refers to the rise in Gini coefficient from 0.394 in 1970 to 0.469 in 2005.--Patchouli 23:17, 28 October 2006 (UTC)

It seems this was changed to: "Some argue this rise corresponds to the lowering of the highest tax bracket, for example, from 70% in the 1960s to 35% by 2000." I don't edit Wikipedia often, but if I knew how I would put a little "citation?" next to that. "Some" argue anything; this seems to me the height of journalistic weaseliness. Can you point towards an economic study that plausibly links the two? In fact, there are many arguments as to why US inequality is increasing: a popular theory of course is globalization; the other biggie is technological progress. Perhaps a more fleshed-out section with citations would be in order. If not, take it out entirely as it is pretty misleading. 18.214.1.179 17:08, 12 January 2007 (UTC) thewhiterabbit11

Credit risk use

Thank you, Bluemoose, for pointing out that a citation is needed for the use of the Gini coefficient in credit risk modelling. For people working there it is one of the basic tools; at meetings we usually hear things like "the model has a Gini of 73.46 %, it is performing quite well". But just a quick googling of "gini coefficient credit risk" gives you many citations. E.g. in [1] on page 14 you can find: "The K-S statistic and the Gini coefficient are common measures of a model’s ability to separate risk." Separate risk = discriminate between good and bad. And so on. It is really a very basic tool. --Ruziklan 10:49, 28 September 2006 (UTC)
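For readers unfamiliar with this usage: in credit scoring the Gini coefficient of a model is commonly computed from the ROC curve as 2*AUC - 1, where AUC is the probability that a randomly chosen good account gets a higher score than a randomly chosen bad one. The sketch below assumes that definition is the one meant here; the scores are invented and the function name is hypothetical.

```python
def scoring_gini(good_scores, bad_scores):
    """Gini of a scoring model as 2 * AUC - 1 (higher score = lower risk)."""
    concordant = 0.0
    for good in good_scores:
        for bad in bad_scores:
            if good > bad:
                concordant += 1.0  # model ranks this pair correctly
            elif good == bad:
                concordant += 0.5  # ties count as half
    auc = concordant / (len(good_scores) * len(bad_scores))
    return 2 * auc - 1


# Invented scores: goods mostly, but not always, score above bads.
goods = [0.9, 0.8, 0.6, 0.4]
bads = [0.7, 0.3, 0.2]
model_gini = scoring_gini(goods, bads)
print(model_gini)
```

A value of 0 would mean the model separates goods from bads no better than chance, and 1 would mean perfect separation.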


The use of the word "optimal" in "optimal Gini coefficient"

It should be pointed out that "optimal" here means optimal with respect to growth. Growth is not necessarily the only goal of society. In fact, studies show that the happiest country in the world is Denmark, which also has the second lowest Gini coefficient. Perhaps income inequalities make people unhappier.

Graph axes

I'm struggling a bit trying to understand how to apply the gini coefficient to income/wealth equality. Looking at the graphic included in the article page has thoroughly confused me. Can I ask for someone to explain those axis descriptions for me please? "The cumulative share of people from lower income" from 0% to 100% graphed against the "cumulative share of income earned"? dpotter 21:58, 1 December 2006 (UTC)

You are looking at a Lorenz curve. Try looking at that article for an explanation of the axes. In general, would it be helpful for the Lorenz curve article and/or the Gini index article to give a small example, perhaps with just 4 people/data points? DCary 03:37, 19 February 2007 (UTC)
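A four-person example of the kind suggested might look like the following sketch. The incomes are invented; the x value is the cumulative share of people, counted from the lowest income upward, and the y value is the cumulative share of income those people earn.

```python
def lorenz_points(incomes):
    """Return (share of people, share of income) pairs, poorest first."""
    ordered = sorted(incomes)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, income in enumerate(ordered, start=1):
        running += income
        points.append((i / n, running / total))
    return points


incomes = [50, 5, 30, 15]  # four people, total income 100 (made-up figures)
for people_share, income_share in lorenz_points(incomes):
    print(f"poorest {people_share:.0%} of people earn {income_share:.0%} of income")
```

So the point above 50% on the horizontal axis shows that the poorer half of this group earns 20% of the total income.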

Is this really a disadvantage of the Gini coefficient?

Currently, the article says this:

Comparing income distributions among countries may be difficult because benefits systems may differ. For example, some countries give benefits in the form of money while others give food stamps, which may not be counted as income in the Lorenz curve and therefore not taken into account in the Gini coefficient.

To me, it sounds like it is the person who fails to include food stamps as an income who is making the error, not the Gini coefficient or the Lorenz curve. —Bromskloss 18:39, 12 January 2007 (UTC)

0 is not "the same income"

It seems that the article has an error. It says: "Here, 0 corresponds to perfect income equality (i.e. everyone has the same income) ...". I think that "perfect income" ( 0 ) doesn't mean the "same income". According to http://en.wikipedia.org/wiki/Lorenz_curve it means that "the bottom N% of society would always have N% of the income". I don't know how to edit that part of the article. 209.50.173.162 19:02, 16 February 2007 (UTC)

There is no discrepancy. When a Gini index of income equals 0, everybody has the same income, and that is also when the Lorenz curve is the "line of perfect equality": the bottom N% have N% of the total income for every N.
Try constructing a 2-person example that illustrates "0 is not 'the same income'" if you still think there is a problem. DCary 03:30, 19 February 2007 (UTC)
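The two-person case can be written out directly. This is a sketch assuming non-negative incomes a and b that are not both zero; the helper names are illustrative. The Gini coefficient reduces to |a - b| / (2(a + b)), and the only Lorenz point between the endpoints is the poorer person's share of the total.

```python
def gini_two(a, b):
    """Gini coefficient for a population of exactly two people."""
    return abs(a - b) / (2 * (a + b))


def poorer_half_share(a, b):
    """Lorenz curve value at 50%: the share of income held by the poorer person."""
    return min(a, b) / (a + b)


for a, b in [(1, 1), (1, 3), (0, 5)]:
    print((a, b), gini_two(a, b), poorer_half_share(a, b))
```

The coefficient is 0 exactly when a equals b, which is also exactly when the bottom 50% hold 50% of the income.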

Weasel words in history of Gini by country

This section has a "Some say...others say" structure.

I'm no economist, but the following seems obvious to me about the US Gini coefficient curve: The Gini coefficient was roughly constant or decreasing until 1980 under Democratic and Republican administrations. This included periods of high economic growth and low economic growth. The Gini coefficient began a rapid rise with the advent of the Reagan administration which explicitly believed in supply side economics. Possible causes:

  • Change in tax structure (mentioned)
  • Deliberate weakening of unions (e.g. firing the air traffic controllers and replacing them with permanent replacements)
  • Changes in nature of economy, e.g., high technology

Many other advanced countries, exposed to the same technological changes as the US, did not follow these trends:

  • France
  • Japan
  • Germany

These countries often had higher growth than the US during the Reagan administration despite their lower Gini indices.

Other advanced countries did follow similar economic policies as the US and showed similar changes in Gini index:

  • Britain

The Gini index in the US ceased its growth with the advent of the Clinton administration in 1992. There is no data on the change in the Gini index under GW Bush.

Therefore, I conclude that the changes in the Gini index in the US have been due to the effects of changes in policy. This may be good or bad: the extra inequality in the US may be tied to its higher growth rate. David s graff 23:05, 23 February 2007 (UTC)

New related page -- Suits index

Interested parties are invited to improve a related article at Suits index. I'm new here so just delete this comment if this was inappropriate for talk. --Perkinsms 21:01, 17 May 2007 (UTC)

Disadvantages section needs rework

In the disadvantages section, several of the subpoints seem to be there to make the section seem longer. I don't think that, when comparing the Gini coefficient to other forms of statistics, "As for all statistics, there will be systematic and random errors..." is a disadvantage unique to Gini. Citing disadvantages common to all forms of statistics is also not common practice in articles on other statistical measures across Wikipedia.

Also, many of the subpoints lack implications. For instance, the first subpoint claims that it is bad that Gini would give the US a worse value than countries in the EU because bigger/more diverse would mean more inequality, but it would seem to me as though any measure of inequality would (and should) yield the same result.

I think you're right on the first point. On the second - well that precisely is the problem. The Gini index is not decomposable, unlike for example the Theil index. So this one's a valid criticism.
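To illustrate what "decomposable" means for the Theil index, here is a minimal sketch of the standard split of the Theil T index into a within-group and a between-group part. It assumes strictly positive incomes; the groups and function names are invented for illustration.

```python
import math


def theil_t(incomes):
    """Theil T index; requires strictly positive incomes."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum((x / mean) * math.log(x / mean) for x in incomes) / n


def theil_decomposition(groups):
    """Return (within, between) components whose sum is the overall Theil T."""
    everyone = [x for group in groups for x in group]
    total = sum(everyone)
    overall_mean = total / len(everyone)
    within = 0.0
    between = 0.0
    for group in groups:
        income_share = sum(group) / total
        group_mean = sum(group) / len(group)
        within += income_share * theil_t(group)  # weighted by income share
        between += income_share * math.log(group_mean / overall_mean)
    return within, between


groups = [[1, 3], [2, 6]]  # two made-up regions
within, between = theil_decomposition(groups)
overall = theil_t([x for group in groups for x in group])
print(within, between, within + between, overall)
```

The Gini coefficient has no comparable exact split in general, which is the criticism being made here.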

Pronunciation

Would someone more knowledgeable than me please add to the first paragraph of this article how English-speaking economists normally pronounce "Gini"? It is not in any of my dictionaries and I can't find it anywhere on the web. "Rhymes with" would be a good start. With thanks --Kjb 21:10, 27 July 2007 (UTC)

It's Italian, so it would be pronounced /ˈdʒini/, roughly like 'genie'.


Extending graph of Gini indices over time backwards

Is it in any way possible to extend the graph of Gini indices over time to before 1950 and preferably to before World War I??

I am very curious as to the effect Gini indices have on social liberalism/conservatism, especially regarding the support for socialism in Europe. Looking very closely at the graph I do see the possibility of extremely high values in pre-World War II Europe which would fit in with the support for socialism there even if costs of living were not as high as they are now.

luokehao

Outdated map

The map used to show national Gini coefficients seems to be about 20 years out of date, since it shows East Germany. This implies that the values given are also out of date. If anyone can find a more up-to date map, this would be welcome. WikiReaderer 19:36, 24 August 2007 (UTC)

I agree, the current Gini coefficient for Germany is 0.344 - see German Statistic Yearbook 2007 (data on Gini is from 2003). It seems better to remove the map than to let it continue to give false impressions to readers. —Preceding unsigned comment added by 85.177.140.210 (talk) 20:12, 16 October 2007 (UTC)
The delay (until income data are stable) is partially due to the lengthy procedures for tax refunds in Germany. By the way: the latest yearbook just appeared this month (October). One awkward thing with this book is that you won't find quantiles for computation of inequality measures under "Einkommensverteilung" (income distribution). You will have to look for the chapter on taxes and revenues. And there is a confusing multitude of "Ginis" in various publications. The yearbook stays away from computing inequality coefficients, but you can compute them yourself, e.g. for the 2001 pre-tax income distribution and the income distribution after taxes. DL5MDA 01:08, 18 October 2007 (UTC)

Diagram incorrect

In the first diagram of the article the area shaded yellow is labeled "Gini index". But the Gini index is the RATIO of the yellow area to the area under the diagonal, or 2 times the yellow area. You can also see it from the formula for the Gini where the Lorenz curve is integrated. So the label should be "1/2 of the Gini index" or just dropped. —Preceding unsigned comment added by 75.134.157.26 (talk) 04:59, 18 October 2007 (UTC)

The Gini coefficient is the yellow area, if the area of the whole triangle is defined as 1, rather than the area of a square twice the size of the triangle. The diagram is correct (although I might include a note on this point). Marcika 14:55, 31 October 2007 (UTC)

"optimal Gini coefficient": please defend.

The "optimal Gini-coefficient" section, at least, is in severe need of a rework: it appears to be based on a single empirical "study" which in turns appears to be concerned pretty much exclusively with claiming that the difference in economic development in the 20-year period between Sweden and Ireland is due to their different policies, and advocating the Irish policy over the Swedish one.

Common sense would suggest that the correlation demonstrated could just as easily be due to chance or, even more likely, due to a third effect not considered in the "study".

I'm not even sure the document can be used as a reliable source. It appears extremely dodgy to me. Note that the basic statement, that, all other things being equal, strengthening a redistribution system to lower the Gini coefficient below a certain value is likely to have overwhelming negative effects in some fairly simple (and working) economic models, is perfectly okay. The "Sweden would have a wonderful economy if only they raised their Gini coefficient" statement hinted at in the study and the section is quite ridiculous, though.

Statements like "Extreme egalitarianism leads to [...] corruption in the redistribution system" appear to me to warrant removal rather than qualification, simply for the way they misrepresent causality. Certainly you could criticise various redistribution systems for being amenable to corruption, and some of those might lead to extreme egalitarianism, too, but that's quite a different kettle of tea.

Again, the "study" used as a reference rings alarm bells on many fronts (typos (in addition to those spelling mistakes I assume were caused by overly direct transcriptions from the authors' first languages), the treatment of inflation, near-total lack of academic credentials for the authors). Unless a fairly good defence is coming up, I'm tempted to treat this as a vanity reference.

RandomP 00:10, 1 November 2006 (UTC)