User:Future Perfect at Sunrise/Fair use

From Wikipedia, the free encyclopedia

Are we losing the battle?

I have only recently got involved in image patrolling as an admin. And my first impression is: I'm appalled at the sheer size of the problem. We are confronted with a deluge of bad image uploads. Many admins and conscientious users are doing their best to stem the tide. But at the moment, it seems we are not coping too well. Are we losing the battle against copyright violations on Wikipedia?

Here is a small attempt at statistics. At the moment of writing, 16 February 2007, the English Wikipedia reported it had 701,095 uploaded files. (This includes videos and sound files, but the bulk is images.) By the time you're reading this, the number has risen to 786,161.

I have no access to numbers of how many images were uploaded during any particular past period, only to figures of how many have survived from any given period. During January and February 2007, the number of images increased by c. 1,500 per day. This means that there were well over 2,000 uploads per day, of which an unknown number were deleted shortly after. The following shows the number of surviving new images, as of 17 February, by age of upload, over the last 6 weeks:

Upload age Number surviving
<1 week 15,833 ████████████████████████████████
<2 weeks 15,220 ██████████████████████████████
<3 weeks 12,285 █████████████████████████
<4 weeks 11,925 ████████████████████████
<5 weeks 11,448 ███████████████████████
<6 weeks 10,948 ██████████████████████

The peak in the last two weeks reflects the raw number of originally uploaded images, and the subsequent drop reflects the onslaught of the first wave of speedy-deletion cleanup. The numbers in the other weeks probably reflect partly the constant increase in raw upload counts due to the naturally accelerating growth of Wikipedia, and partly the retroactive cleanup efforts in weeding out older bad images.

During one random day in the same time period, administrators deleted approximately 900 images. Assuming this to be representative, we can estimate the number of deletions per week at about 5000. About half of these must be speedy deletions of recently uploaded images within the first two weeks; the rest is retroactive cleanup of older ones.

This means, roughly, that out of five uploaded images, one gets deleted quickly within the first two weeks.
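
The arithmetic behind this estimate can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the approximate figures quoted above; the assumption that weekly uploads equal net weekly growth plus weekly deletions is mine, as are all variable names.

```python
# Approximate figures quoted in the text.
net_growth_per_day = 1500   # net increase in surviving images per day
deletions_per_week = 5000   # estimated total deletions per week
quick_share = 0.5           # "about half" hit uploads less than two weeks old

# Assumption (not stated in the text): uploads = net growth + deletions.
uploads_per_week = net_growth_per_day * 7 + deletions_per_week
uploads_per_day = uploads_per_week / 7

# Share of uploads that is removed quickly, within the first two weeks.
quick_deletions_per_week = deletions_per_week * quick_share
quick_deletion_rate = quick_deletions_per_week / uploads_per_week

print(f"uploads per day: about {uploads_per_day:.0f}")
print(f"share deleted within two weeks: {quick_deletion_rate:.0%}")
```

Under these assumptions the quick-deletion share comes out at roughly one in six, which is the same order as the rough one-in-five figure.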

But how good are the rest? How many of the images that survive the first purge are still bad in terms of copyright? I did a little test. "Bad", for the purpose of these statistics, means: obviously inadequate license/source tagging, recognisable at first sight as requiring admin intervention. This includes: missing source information; obviously implausible or false claims to free licensing or self-made status; missing or obviously inadequate fair-use rationales; self-contradictory tags; usage in articles that obviously does not match the fair-use declaration on the image page; etc. Here's the result, with a random sample of 220 images uploaded during the last year, distributed across 11 sub-samples of 20 randomly selected images each, by age of upload:

Age of upload   Good   Bad
1 hour             3    17
2 days             9    11
1 month            9    11
2 months          12     8
3 months          11     9
4 months          11     9
5 months           7    13
6 months          12     8
8 months           9    11
10 months          6    14
1 year            14     6
Total            103   117

In other words: Half of all surviving images on the English Wikipedia are bad. By rough estimation.
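
The totals and the overall proportion can be checked with a short Python sketch. The counts are copied from the table above; the 95% interval is my own addition, using a plain normal approximation and assuming the 220 images behave like a simple random sample.

```python
import math

# (age of upload, good, bad), as given in the sample table.
samples = [
    ("1 hour", 3, 17), ("2 days", 9, 11), ("1 month", 9, 11),
    ("2 months", 12, 8), ("3 months", 11, 9), ("4 months", 11, 9),
    ("5 months", 7, 13), ("6 months", 12, 8), ("8 months", 9, 11),
    ("10 months", 6, 14), ("1 year", 14, 6),
]

good_total = sum(good for _, good, _ in samples)
bad_total = sum(bad for _, _, bad in samples)
n = good_total + bad_total
bad_share = bad_total / n

# Normal-approximation 95% interval (an assumption, not from the text).
half_width = 1.96 * math.sqrt(bad_share * (1 - bad_share) / n)

print(f"good={good_total}, bad={bad_total}, n={n}")
print(f"bad share: {bad_share:.1%} +/- {half_width:.1%}")
```

With a sample of this size the estimate is only good to within several percentage points, so "half" is a fair summary but not a precise figure.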

And when I say "bad", I'm not talking about "replaceable fair use images", that is, those images that are likely to be legal but which the Wikimedia Foundation nevertheless wants to get rid of in the interest of the noble goal of having only truly "free" content. I'm talking about images that are probably not fair use at all. In other words, copyright violations.

What is interesting is that this proportion seems to be constant over time. If we admins were successful in weeding out bad images, there should be high proportions of baddies only among the youngest uploads. The further back one goes, the more noticeable the drop should be. This does not seem to be the case, except for the things that get corrected within the very first two days. Among the images that have survived several months of supposedly constant filtering and review, the baddies are just as frequent as among the uploads just two days old. There are two possible explanations for this, one pessimistic and one optimistic. The pessimistic one is: The number of baddies we actually manage to spot and delete is negligible compared to the sheer volume of uploads. The optimistic one is: A year ago the proportion of baddies among the original uploads was even higher than it is now, and our cleanup efforts in deleting the old ones have gone hand in hand with a relative improvement in the behaviour of uploaders. I can't say right now which of these hypotheses is closer to reality.

In any case, with the present apparent rate of below 1 deletion in 5 uploads, we will not manage to get Wikipedia clean.

Frightened? You should be.

Fair use

Here's my take on "fair use" images. If I deleted or tagged an image of yours as invalid fair use and you want to complain or find out why, please read this.

So what is fair use about?

As the links above will explain in more detail, there is no hard and fast set of criteria of what is or isn't "fair use". That's what makes it so difficult. However, there's one important factor that is often overlooked in the application of the fair use doctrine by Wikipedians: Fair use must involve a "transformative factor" ([1]). The image must be used to do something new with it, something transcending its original use.

Most typically, it is "fair" if you use an image in order to be able to talk about the image, for criticism, explanation of what role the image plays in some larger context, etc. It is not "fair" if you use the image simply as a vehicle for doing what it was originally designed for anyway: illustrating whatever it is that it illustrates.

For example, imagine some guy takes a photograph of an event X. He might want to make money off it by selling the image rights to news agencies. The news agencies might buy it in order to create web pages about event X. We, too, create a web page on Wikipedia about event X. By using the image on that page, we are doing the same thing it was designed for: illustrating X. Something that other people have to pay money to do. Hence, not "fair". If our article were about the photographer and used the image to discuss his artistic style, it would be different.

Objections...

Here are some of the objections I have frequently seen about deletions of allegedly FU images:

But it's not replaceable!

The short answer is: tough luck for Wikipedia, but so what?

The long answer is: I find the criterion of "non-replaceability" (Criterion 1 of the WP:FAIR#Policy) often misunderstood or misrepresented, and our fair use image tags aren't doing a very good job at explaining it either. In fact, non-replaceability is not among the criteria that make an image FU. Non-replaceability is a self-imposed additional criterion defined not by FU law but by internal Wikipedia rules. It is a criterion that an image has to pass in addition to being FU. If it wasn't FU to begin with, then its being non-replaceable doesn't change a thing.

Quite to the contrary, even. If an image is genuinely unique or difficult to replace, that fact is likely to increase its commercial value for the person who owns it, and hence, it increases the potential damage we are doing to somebody else's property by pinching it.

Look at it this way. If you say: "I can't find any free picture to illustrate X, so it's only fair I should be allowed to take this one", that's no different from saying: "I covet my neighbour's house, or field, or ox, or donkey, so it's only fair I should be allowed to steal them." That's not the concept of "fairness" the "fair-use" doctrine is about.

But it's a photo of a unique event!

Same as above. The lucky bastard who managed to take a photograph of that unique event is probably still wanting to cash in on his luck. For instance, by selling the image to websites that wish to create pages about this unique event. Just like us.

The operative criterion for including an historic photograph under fair use is not that we want to talk about the event and there's no other illustration of it available. It is fair use only if we need to talk about the image itself (its iconic status etc.) and we can't reasonably do that without showing it.

But Wikipedia will be a poorer encyclopedia without these images!

Sure. Again, tough luck for Wikipedia, but so what? We all love to have as many good, informative, high-quality images as possible. But we must understand two things:

  • Wikipedia has no god-given right to have all the images it wants, to illustrate everything it wants to illustrate. If there isn't a free image of a certain object or person or event, then that's just it. No justification for stealing one simply because we want it. (Haha, only serious: You can always buy one from the photographer and donate it to us.)
  • The world won't end just because we have an article lacking a photograph to illustrate it. The best encyclopedias of human history were written entirely without photographs. Wikipedia will survive without this one. Life goes on.

But it's a promotional image! Its owners are happy for us to use it!

That's a tough one. Assuming for the moment that the claim is true in the individual case and the image is really from some press kit offered for use without payment, we have three different sets of conditions on usage:

  1. The owners' implied conditions, they being happy about us using the image for "free" (as in "free beer") under certain criteria (for instance, no modifications, no commercial re-use, etc.)
  2. The Wikimedia Foundation's much stricter goal of only hosting material that is really "free" (as in "free speech"), i.e. free for all sorts of re-use, modification etc.
  3. The conditions of "fair use", which allow use under an entirely different set of criteria (for critical commentary, etc.)

What people are currently doing is this: Seeing that the images don't match the criteria of (2), we silently rely on their matching the criteria of (1). But knowing that the Foundation doesn't like to hear of (1), we just choose to lie to ourselves and to the Foundation and pretend we're under (3). Which in 95% of all cases won't be true. In most of these cases, we're unlikely to be sued, because the copyright owners don't care about the finer distinctions between (1), (2) and (3). But it's still a lie.

Personally, I would much prefer it if the Foundation could just go easy on (1), for instance by re-allowing images "used with permission" or "for non-commercial use only". Much better than all of us messing around with false fair-use claims. But the Foundation seems to be adamant about its strict stance in this respect, so we'll have no choice in the long run but to get rid of these images.

But it's used only for informative purposes!

Well, news agencies or other similar commercial web content providers are also using them "for informative purposes". And they still have to pay for them.

But it's of low resolution and therefore of less value than the original!

The fair-use doctrine for low-resolution items goes for things that are as small as real "thumbnails". For instance, it's fair for Google to create thumbnail copies of copyrighted images in its search result pages. The difference is: It's fair use if you use a thumbnail simply to identify which image you're talking of. It's not fair use if you use a smallish copy of an image to do just what the image is designed to do: illustrate the object it depicts. Test: As long as it's big enough that some commercial site might still find it useful as an illustration in a page illustrating the object or event in question, it probably still has a market value at that size. Hence it's not fair use, as long as we are doing nothing else with it than the hypothetical commercial site might be doing: namely, illustrating object or event X.

So, are you opposed to fair-use images in general?

No, most emphatically not. I'm all for having fair use images on Wikipedia. But only for purposes that are really, genuinely "fair use".

But you speedy-deleted my image out of process!

Could be. I'm pretty new at image-policing, and I'm still slightly confused about the rules of what can be speedied when. So I might have made a mistake about process. In such cases, I'll generally be happy to undelete and take the image to a proper review. But before you insist on that, please read the above, and the Stanford pages linked to above, and the policy page, and consider carefully whether your image has a snowball's chance in hell of surviving that debate.