Wikipedia talk:WikiProject Missing encyclopedic articles/Archive 1

This is an archive of the discussion at Wikipedia talk:WikiProject Missing encyclopedic articles.

EB-specific instructions

We tailored the instructions here specifically to the EB2004 project, but this project page has a broadened scope. They should be revised, but (as the original author) I hesitate to start because I think they are already somewhat too long, and by adding assistance specific to the other sources, they would become just mind-numbing. Does anyone with better organized instructional skills want to make a start at reorganizing them, maybe? David Brooks 02:58, 13 Jun 2005 (UTC)

Russian names bot

I noticed that Britannica uses full Russian names, complete with the "father name" (patronymic), so in many cases a simple redirect is enough. Maybe a bot should be used to do that. It should do the following:

  1. Find Britannica titles consisting of three words "Foo Bar Baz"
  2. Check whether "Foo" is a common Russian name
  3. Check whether the article "Foo Baz" exists in Wikipedia. Check also alternative spellings of "Foo" (Alexandr, Aleksandr, Alexander and such).
  4. Make "Foo Bar Baz" redirect to "Foo Baz"

I think this procedure would greatly reduce the numbers of missing articles.  Grue  16:25, 13 Jun 2005 (UTC)
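
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the bot anyone actually ran: the name groups, the function name `propose_redirect`, and the idea of passing in a set of existing titles are all assumptions made for the example. It only proposes a redirect for a human to review, in line with the caution raised in the reply below.

```python
# Hypothetical data: groups of spellings treated as the same common
# Russian first name. None of this comes from the discussion itself.
NAME_GROUPS = [
    {"Alexandr", "Aleksandr", "Alexander"},
    {"Nikolay", "Nikolai"},
    {"Ivan"},
]


def propose_redirect(title, existing_articles):
    """Return (source, target) for a patronymic redirect, or None."""
    words = title.split()
    if len(words) != 3:                      # step 1: "Foo Bar Baz"
        return None
    first, _patronymic, last = words

    # Step 2: is "Foo" a known common Russian first name?
    group = next((g for g in NAME_GROUPS if first in g), None)
    if group is None:
        return None

    # Step 3: look for "Foo Baz", trying the original spelling first,
    # then the alternative spellings in a fixed (sorted) order.
    candidates = [first] + sorted(group - {first})
    for variant in candidates:
        target = f"{variant} {last}"
        if target in existing_articles:
            return (title, target)           # step 4: proposed redirect
    return None
```

For instance, with `{"Alexander Pushkin"}` as the set of existing articles, the title "Aleksandr Sergeyevich Pushkin" would yield a proposed redirect to "Alexander Pushkin", while a three-word title whose first word is not in the name groups yields nothing.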

It's a good idea, but I am not sure there are actually enough to warrant the effort; plus, bots tend to make mistakes when even the smallest amount of intelligence is needed. Bluemoose 18:12, 13 Jun 2005 (UTC)

Template

WikiProject Missing Encyclopedic articles
(% done)
Project page - The goal of this project is to ensure that Wikipedia has a corresponding article for every article in every other encyclopedia. Sign in!
Monthly focus : OBI Biographies - OBI biographies are 98.3% done. Let's finish them off!
1911 verification - 2.6%
Hotlist of topics - 83%
General topics - 71.6%
Other Wikipedias: de es fr
Science topics - 43%
Catholic Encyclopedia - 72.2%
Easton's Bible Dictionary - 77.9%
Gutenberg authors - 48.2%
Jewish Encyclopedia
Literary Encyclopedia - 68%
OBI Biographies - 98.3%
Find-A-Grave - 58.3%
ACF Regionals answers
Miscellaneous
Many other lists of politicians, albums, films, TV shows and others.
Overall progress - 53.7%
Spread the word through {{Project missing articles}}

I just made this template; it's for the main project page, the list pages and your user pages. I think it is a great idea, because when the weekly focus page and "Todo" get changed, the change will show up on every instance of the template. It will also provide much advertising and all the useful links. I put it on here first because I'm sure it can be vastly improved/tweaked. Thanks, Bluemoose 18:12, 13 Jun 2005 (UTC)

I originally put this under the discussion for the template, but it didn't seem to garner much discussion. Anyway, should a numerical listing of remaining articles also be included in the template? It wouldn't be a precise number like 3021, but ~3000 or <2500 would be appropriate. This would also help to identify projects that are close to being finished, like Nuttall (<2000), compared with 1911, which has >15,000 remaining. This information is in each of the individual project pages, but I also thought it was worth listing in the template. --Leonsimms 20:37, 3 August 2005 (UTC)
And to counter my own point, the template is starting to get a little bit unwieldy. It's pretty busy without conveying information in a straightforward manner. I'm going to change the formatting a bit, but feel free to tinker with my changes. Leonsimms 20:37, 3 August 2005 (UTC)
You were right to prune it a bit. The focus article idea hasn't really taken off, so I guess it is ok to remove that. Pcb21| Pete 21:09, 3 August 2005 (UTC)
Yes, good job. As for the numbers, there is already too much pruning/calculating %'s and other non-productive editing; I certainly don't think we need any more. Martin (Bluemoose) 21:14, 3 August 2005 (UTC)
Just as a word of warning for anyone who liked my first suggestion, I tried adding the numbers but I could never get the formatting to look right, it looked too crowded. Leonsimms 21:24, 3 August 2005 (UTC)

Duplicate list entries

I just added an instruction to this effect, but remember: don't consolidate duplicate entries in the lists (or, if you do, add a comment as I just did to /Q). They represent two articles with the same name but different topics. For WP they will need a disambig pointing to two (non-empty) content articles. David Brooks 18:26, 14 Jun 2005 (UTC)

"Purge" Bot

I have recently created the RABot to help manage requested articles pages (i.e. sort lists, remove created articles, clean up formatting, etc.). With only minor modifications, I could run it on all the encyclopedia pages you have here to wipe out the completed requests. Since it only operates under manual control, it never runs faster than about a minute per page (and considerably slower if I have to stop to correct mistakes), so it is not something that I am likely to do often, but occasionally going through and clearing out the encyclopedia request pages seems like a reasonable thing to do.

However, I didn't want to do so without coming here and making sure it would be okay with the community for me to clear out all the requests that have been fulfilled. Dragons flight 06:23, Jun 17, 2005 (UTC)

P.S. If you want to know how your encyclopedia requests compare to the other requested articles, you might enjoy taking a look at the stats page that I have created.

It sounds interesting; however, I think we would not want formatting to be changed (for Britannica articles at least), as the idea is that if someone searches for an exact Britannica title, they will find it. I suspect formatting of the general topics does not matter, but don't quote me on that. In terms of removing the "blues" it sounds useful, but do you mean it removes one link a minute, or removes all the blue links off one page a minute? The latter would be very useful. Thanks, Bluemoose 08:25, 17 Jun 2005 (UTC)
All links in one minute. I assume you have already sorted the lists in the way you want, so I would turn that off. After that, formatting is mostly concerned with whether white space and new lines are used consistently, so probably not something you would be concerned with. However, before I go through and do all of the pages, I can do a couple and people can look them over to make sure they look right. Dragons flight 14:13, Jun 17, 2005 (UTC)
Be aware also that a number of the lists have been annotated, marking up existing articles that may contain some information, or articles that are not the same as the Britannica one. We would not want these removed. Where the bot may be very useful is in pruning the Nuttall Encyclopedia topics, most of which currently contain lots of blues. --OpenToppedBus - Talk 10:43, Jun 17, 2005 (UTC)
Your notation seems to be "* or # [[link]] - optional notes, new line". I would only be looking at that first link to determine whether the topic had been created and treating anything that came after it but before the next line as notes to preserve if the topic hadn't been created. Is that the kind of annotation you are referring to? The case of things that are marked as not equal to the encyclopedia topic would be harder. Is there any semi-standard notation that I can look for to determine this? This is a manually controlled script so in principle I can check all of these, but if there was a way to recognize most of them automatically it would save time. For example, on RA, requests followed by the word "redirect" or "copyvio" are passed over since even though they may be blue they probably don't point at what the requester wanted. Dragons flight 14:13, Jun 17, 2005 (UTC)
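
The rule described in this comment might look something like the sketch below. It is not RABot's actual code: the regular expression, the function name `keep_entry`, and the `existing_titles` set standing in for a "does this article exist?" lookup are assumptions made for illustration.

```python
import re

# Assumed entry shape, following the notation quoted above:
#   "* or # [[link]] - optional notes"
# Group 1 is the first link's target; group 2 is everything after the link.
ENTRY = re.compile(r"^\s*[*#]+\s*\[\[([^\]|]+)(?:\|[^\]]*)?\]\](.*)$")

# Marker words that, per the RA convention mentioned above, mean a blue
# link probably does not point at what the requester wanted.
PASS_OVER = ("redirect", "copyvio")


def keep_entry(line, existing_titles):
    """Return True if the line should stay on the list, False if it can go."""
    match = ENTRY.match(line)
    if match is None:
        return True                 # not a list entry: leave it untouched
    title = match.group(1).strip()
    notes = match.group(2).lower()
    if title not in existing_titles:
        return True                 # still a red link: keep it, notes and all
    # Blue link: keep only if the notes flag it as a redirect or copyvio.
    return any(word in notes for word in PASS_OVER)
```

Under these assumptions, "* [[Yaka]] - redirect to unrelated topic" would be kept even though Yaka exists, while a bare "# [[Nuttall]]" with an existing article would be dropped. Annotations that use other wording would slip through, which is the gap the question about a semi-standard notation is getting at.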
All lists except EB2004 are ok, I would imagine? Bluemoose 10:51, 17 Jun 2005 (UTC)
Could you clarify what the problem is with EB04? Is it just that you don't want blue links removed from that list (in which case I will skip it), or is there some formatting you are afraid will be disturbed (in which case I would teach the script to understand and preserve it)? Dragons flight 14:13, Jun 17, 2005 (UTC)
I think you would be safe to remove all blues on EB04 if they are not followed by a comment. But EB04 entries are followed by external links on a lot of pages, which might cause you problems; also, it is kept fairly up to date anyway. Using it on the Nuttall, EB1911 and general topics would, I imagine, be much easier for you, and much more useful for us. Go ahead and try it. Thanks, Bluemoose 14:20, 17 Jun 2005 (UTC)

Okay, some time over this weekend (hopefully) I'm going to put together a sample by processing two pages of each encyclopedia except EB04. Assuming there are no complaints, and I wouldn't expect there to be any, then I'll go through and do the rest of those encyclopedias. Dragons flight 00:34, Jun 18, 2005 (UTC)

No. I would strongly oppose. We have to check each entry, particularly on '04, to make sure that it reflects the right article. Yaka is a good example: the Britannica entry is about an ethnic group, while the Wikipedia entry is about a minor Star Trek character. A bot will not be able to check this. I would not, however, mind if this was done on Nuttall. (Danny didn't sign)

Well let's be clear what we are talking about. This morning (i.e. ~10 hours ago my local time) I ran the script passively over your encyclopedia pages to see how many entries were involved. Summarized as follows:
Encyclopedia              Entries   Active
Nuttall                      4166      236
Britannica 1911              9061     1104
Britannica 2004             17430      382
"General Encyclopedic"      48376     2827
Total                       79033     4549
However, I did notice that Danny took these comments as something of a challenge so some of the lists have presumably come down since then. If you are really opening each active link and comparing to the content under the same topic in Britannica, etc., then this probably takes at least two or three minutes per link, and so you would have about 150-200+ man hours of work to check them all. If you and others want to check them all, then be my guest, I'll go away. I'm simply offering to remove the 6% of links that have already become active, if you would like me to, because I have a tool that allows me to do it far faster than an unaided human could. Dragons flight 05:07, Jun 19, 2005 (UTC)
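
The figures quoted in this comment can be checked with a few lines of arithmetic. The counts come from the table above; the variable names and the use of exactly 2 and 3 minutes per link as the bounds are choices made for this sketch.

```python
# (entries, active) per list, as reported in the table above.
counts = {
    "Nuttall": (4166, 236),
    "Britannica 1911": (9061, 1104),
    "Britannica 2004": (17430, 382),
    "General Encyclopedic": (48376, 2827),
}

total_entries = sum(entries for entries, _ in counts.values())
total_active = sum(active for _, active in counts.values())

# Share of links that have already turned blue ("active").
active_share = total_active / total_entries

# Hand-checking effort at two to three minutes per active link.
hours_low = total_active * 2 / 60
hours_high = total_active * 3 / 60

print(f"{total_active} of {total_entries} active ({active_share:.1%})")
print(f"roughly {hours_low:.0f} to {hours_high:.0f} hours to check by hand")
```

The totals match the table (79033 entries, 4549 active), the active share comes out a little under 6%, and the effort works out to roughly 150 to 230 hours, consistent with the "150-200+" estimate.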
I agree - we really do need to check these by hand. I'd also like to see us check by hand all of the lists (including the African and Middle-Eastern topics one, where this was already run without consensus) - it's not that hard to do, and will stop us missing things needlessly. Ambi 06:11, 18 Jun 2005 (UTC)
If you and others really want to volunteer to check them, then be my guest. See my comments to Danny about the scope of that effort. With respect to the African and Middle-Eastern pages, I apologize if I stepped on anyone's toes. Since they look like lists of requested articles, contain no special instructions (or links to this project), and are linked from various requested articles subpages (to be fair I added some of those links), I hope you can understand why I chose to treat them as requested articles pages, i.e. exactly what my script was designed for. The existence of this script and its use on requested articles pages was discussed and accepted at Wikipedia talk:bots but I am happy to discuss it with others. After a week of running it across WP:RA, there are no ongoing objections (save the one you just made), which I would regard as implicit acceptance of its existence even if it is not an explicit endorsement. If there exists a consensus to treat the Africa and Middle Eastern lists differently, then I would be happy to remove them from the set of pages I process, but I would like to see some people speak up and say they intend to maintain them. My form of processing can't be worse than simply neglecting them, and I would point out that Africa had not been pruned in 6 weeks and that was done by the frequent RA contributor DMG413 who almost certainly didn't do any entry checking either. If you just create lists and then don't maintain them, I would expect that to discourage participation, as people don't see them as being a serious endeavor. Dragons flight 05:07, Jun 19, 2005 (UTC)

Sorry, but I am not convinced. I think we should be more careful about what we purge, lest we purge copyright violations (there are some marked in the lists), inadequate redirects, and others. It is worth the time and extra effort to do a good job. I don't believe a bot is adequate. And by the way, I have pruned the African list, and checked when I did it. Danny 05:23, 19 Jun 2005 (UTC)

To be fair it's not a true bot, since I read and approve each edit (really an editing tool rather than a bot), so I wouldn't remove anything actually marked as a copyvio, bad redirect, etc. Of course, many such presumably aren't marked. You also seem to be placing a great deal of importance on catching copyvios and the like right now (which are easy to miss even if you are looking). I tend to take an eventualist approach to this, and assume that once an article is created there will always be future eyes to look at them and catch mistakes. In fact, I would regard an editor interested in the topic as far more likely to catch problems. However, if the consensus is to check all the encyclopedia articles by hand I am willing to step aside and wish you luck. Though before I do that, I would like to hear a few more voices on this matter. Right now the comments on this thread are fairly evenly divided, and for a decision that effectively creates more work for the project, I would think it is worth knowing that this is really what the majority wants. Dragons flight 05:47, Jun 19, 2005 (UTC)
First of all, this WikiProject has just been started last week. I'm pretty sure that capacity or man hours won't be a problem if it gains more momentum. WRT to the matter at hand, I'm afraid the eventualist approach is not the best one in this case — the very reason that these lists exist is to warrant that future eyes look at them and catch false redirects and other misfits. Furthermore, 'neglect them or prune them' is a false comparison.
Above all, we should not forget that these lists are not here to be pruned, but to point out that we still need a lot of articles. It is far less harmful for a blued link to stay for a while (and get the chance to be checked) than to be pruned by a bot as soon as something exists. -- mark 07:51, 19 Jun 2005 (UTC)
Mark, I'd actually missed the point that this was a new project. It is probably a compliment to you guys that it doesn't look or feel new. Dragons flight 08:31, Jun 19, 2005 (UTC)
Actually the project has been going for at least a year - the project page is much younger. Pcb21| Pete 10:50, 19 Jun 2005 (UTC)
I meant the project page, of course; the lists are a lot older. But I think that this enterprise is likely to gain more contributors now that it's cast in the form of a WikiProject. — mark 16:23, 19 Jun 2005 (UTC)
I think checking the blue links by hand is a matter of due diligence, but I'd like to know what happened when the very first versions of the list were created. Is it true that the creator (Bogdangiusca?) first automatically removed all the entries from the EB index that already matched WP titles? If so, then the genie is out of the bottle: we never got the opportunity to match any of those by hand. David Brooks 23:51, 19 Jun 2005 (UTC)
Okay, okay, I give. Good luck to everyone doing them by hand. Dragons flight 02:44, Jun 20, 2005 (UTC)
PS. I would still like to resolve the details of the copyright question I asked below, if people have anything more they can contribute to that discussion.

Missing articles from fr.wikipedia.org

Inspired by a similar list on the Dutch wikipedia, I've created a list with articles that exist in a large(ish) number of languages, but apparently not in English. I've based this list on a database dump of the French wikipedia. There are many false hits, so I don't know yet if the list can be used to raise the quality of the English wikipedia, but at least we can add a lot of links to the French wikipedia.

I can also make this list for other wikipedias. Should I? Eugene van der Pijll 20:48, 17 Jun 2005 (UTC)
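
One way such a list could be derived is sketched below. The discussion does not say how the actual list was generated beyond its being based on a French database dump, so everything here is an assumption: the `langlinks` mapping (French title to the set of language codes it links to), the threshold `min_languages`, and the sample titles.

```python
def missing_in_english(langlinks, min_languages=3):
    """List titles linked to many languages but with no English link.

    langlinks: dict mapping an article title to the set of language
    codes found in its interlanguage links (hypothetical input format).
    Returns (title, language_count) pairs, most widely covered first.
    """
    rows = [
        (title, len(langs))
        for title, langs in langlinks.items()
        if "en" not in langs and len(langs) >= min_languages
    ]
    # Sort by descending language count, then alphabetically by title.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Made-up sample data, for illustration only.
sample = {
    "Cuisine bretonne": {"de", "es", "nl", "it"},
    "Paris": {"en", "de", "es"},
    "Sujet obscur": {"de"},
}
print(missing_in_english(sample))
```

A missing English link does not prove a missing English article, since the article may exist and simply not be linked from the French page. That would be one source of the false hits mentioned above, and the reason the list is also useful for adding links.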

Where can we find that list? — mark 02:01, 18 Jun 2005 (UTC)
Wikipedia:WikiProject Missing encyclopedic articles/fr -- mark 02:03, 18 Jun 2005 (UTC)
I don't suppose there's any way of getting an English translation of the articles needed? Ambi 06:11, 18 Jun 2005 (UTC)
I'll work on a few of those, as my time permits. Mamawrites 04:55, 5 August 2005 (UTC)

Improve Wikipedia's credibility?

It would be useful to also be able to see a listing of Wikipedia articles whose topics appear in no other encyclopaedia. Some of my work on WP has been to rehabilitate topics dropped by the current Britannica but which are of importance to historians and biographers. How easy would that be, please? Apwoolrich 11:41, 18 Jun 2005 (UTC)

Yeah, I've been thinking about that, too. It's an interesting idea. We should be aware that it would first of all expose our fancruft bulge — but there are lots of perfectly legitimate and important articles that Wikipedia has, but that others don't. Lord's Resistance Army comes to mind — I just checked Britannica 2005 and all they have is a paragraph or two in their 'Year in Review: 2003: Africa' and 'Year in Review: 2003: Uganda' articles. — mark 12:17, 18 Jun 2005 (UTC)
Problem is that listing will be 20 times the size of the 2004 listing. Not easy to analyse. Pcb21| Pete 15:59, 18 Jun 2005 (UTC)
How about featured articles that don't exist in Britannica. I doubt they wrote about exploding whales or "All your base".  Grue  16:18, 18 Jun 2005 (UTC)
Good idea. Maybe others might be added, such as Engrish. Apwoolrich 18:22, 18 Jun 2005 (UTC)
Bread clip, Rubber duck, ... (not featured articles, perhaps, but the best treatments of the subject in any encyclopedia, I'm sure). But I don't know if listing these articles will improve wikipedia's credibility with everyone.
Talking about Britannica and credibility: here's an article I came across while creating redirects for this project: Code of Kalantiaw; compare this with the EB article... Eugene van der Pijll 18:39, 18 Jun 2005 (UTC)
Now that's funny. But I think a cleanup tag would be in order. (gotta run; grandson arrived...) David Brooks 19:00, 18 Jun 2005 (UTC)
Oh, that's brilliant. If the Wikipedia article is true, where is that page poking fun at Britannica's errors again? Ambi 03:49, 19 Jun 2005 (UTC)
meta:Making fun of Britannica. The perfect candidate by the looks of it. Pcb21| Pete 10:30, 19 Jun 2005 (UTC)
Except some goon has boringified the name and moved it to en:wp - see Wikipedia:Errors in the Encyclopædia Britannica that have been corrected in Wikipedia. Pcb21| Pete 10:32, 19 Jun 2005 (UTC)

Whee! If you have access to EB, I really can recommend looking up some articles in your field of experience and comparing them with their Wikipedia equivalents that you know are good. I've compared some of my own articles with Britannica and found enough errors to start a new section ("Languages and linguistics") on the 'Errors in the Encyclopaedia Britannic...' page. I furthermore discovered a particularly grave error in their coverage of Qala'un Mosque, which I created last week for this project, to blue the missing link. Check this out. -- mark 17:20, 23 Jun 2005 (UTC)

New Todo Project

Well, Nuttall has been pruned. It would be good if we had another simple project for the Todo square, along with filling in the blanks. One that I would propose is transferring all the computer viruses on the General list to the List of computer viruses page. Some have been done, some are still embedded in the lists, and some are located at the bottom of the page. How about it? And please think of other, similar projects for when we finish that. Oh, and how about those Q's in 2004. Nice work. Danny 05:16, 19 Jun 2005 (UTC)

Another List

How about adding this to our goals to help Canada: Wikipedia:Canadian wikipedians' notice board/Dictionary of Canadian Biography. Danny 05:18, 19 Jun 2005 (UTC)

See also Wikipedia:Australian wikipedians' notice board/Complete to-do/Australian Dictionary of Biography. Ambi 08:11, 19 Jun 2005 (UTC)
Thanks for pointing that one out, Ambi. I have added both to our list. I wonder if we can't find other biographical dictionaries to glean information from. Danny 13:43, 19 Jun 2005 (UTC)

Copyright?

I hate to create problems for people doing good work, but has there ever been any serious discussion of the copyright implications of creating lists based on the indices of copyright protected work? Dragons flight 10:27, Jun 19, 2005 (UTC)

There was a discussion on the WikiEN mailing list about a similar list of topics from the Columbia encyclopedia, and I think the consensus was that it's OK. See especially Jimbo's reaction. Eugene van der Pijll 11:18, 19 Jun 2005 (UTC)
I assume that Jimbo either didn't read or didn't properly consider the applicable legal decision, in Feist v. Rural, which said in the implications section that a work "cannot contain any of the 'expressive' content added by the source author. That includes not only the author's own comments, but also his choice of which facts to cover". It seems clear that an encyclopedia does choose which articles to include and Feist v. Rural noted that selection was copyrightable. Hence, this remains a copyright infringement. To seek to avoid this, it's possible to try to determine which articles encyclopedias in general include, via a survey of many of them. That might (or might not) be successful - my guess is not, because it still results in direct competition with works of an identical nature and a clear derivative work (derived from several encyclopedias). I don't see much prospect of a fair use defence succeeding in the US and of course outside the US, that fair use option doesn't exist at all. Jamesday 23:51, 23 Jun 2005 (UTC)
If you are familiar with this, could you perhaps point to where Columbia encyclopedia is discussed? The link you gave basically says: "Didn't we settle this before, now let's talk about biography stubs". Incidentally, it does not inspire confidence that the list of biography stubs being discussed was subsequently taken off the internet. Dragons flight 11:27, Jun 19, 2005 (UTC)
The discussion about the Columbia topics was in the same month. Search for "Columbia" in the mailing list index; unfortunately, the discussion is a bit fragmented, as not all mails are in the same thread. Eugene van der Pijll 11:32, 19 Jun 2005 (UTC)
http://mail.wikipedia.org/pipermail/wikien-l/2004-March/011449.html Jimbo writes: "This is not problematic in the least". Pcb21| Pete 11:42, 19 Jun 2005 (UTC)

Wikipedia:Columbia Encyclopedia article titles was deleted because of copyright concerns Wikipedia:Goings-on/February_29,_2004. Dragons flight 12:09, Jun 19, 2005 (UTC)

And it was later decided that this isn't an issue - or at least that any risk was outweighed by the benefits. I just wish that that was still in the database so I could undelete it - was that the actual location? Ambi 12:23, 19 Jun 2005 (UTC)
I think so, it is what is mentioned on the mailing list and there are still a few links to it on "What links here". When you say "later decided", can you point to who/when/where? Dragons flight 12:34, Jun 19, 2005 (UTC)
Um, no, it wasn't decided that it wasn't a risk - it's clear copyright infringement. Some people didn't like that, and it's undoubtedly inconvenient, but that's not the same as not being a problem. Jamesday 23:51, 23 Jun 2005 (UTC)
Yes it was that page, probably was a "database deletion" - I recall that the most vocal proponent of deleting the pages also has/had full access to the db so that potentially explains it. I know at least one other user decided not to take part in the missing topics project because his view about copyright did not coincide with others working on the project. It'd be a shame if Dragons flight does the same, but of course it is his decision and I am sure we would all respect that. Pcb21| Pete 12:36, 19 Jun 2005 (UTC)
I did not have full access to the database at that time. I don't know who deleted it. It seems most likely that it was just normally deleted, though. Around June of that year, about a month before I acquired full access, there was a database problem and all normally deleted articles were completely removed, because they weren't backed up and couldn't be restored. Today, the deleted articles are backed up along with the rest, though they are held in a private backup, not the public dumps. I think it was me who added them to the list of tables to be backed up, so there wouldn't be a repeat of that loss. Jamesday 01:52, 24 Jun 2005 (UTC)
Thank you very much for the explanation. What I wrote above about the deletion was a bit of speculation that, as your reply shows, I should've best kept to myself. Apologies. Pcb21| Pete 07:32, 24 Jun 2005 (UTC)
I dug the copyright discussion out of the archive [1]. Look under March 2. Dragons flight 13:28, Jun 19, 2005 (UTC)

I really would like to see evidence of how it is okay now, given that it apparently wasn't a year ago March. In my own professional life, I have been confronted with a very similar situation and we ended up scrubbing a project because the risk of copyright liability was judged to be too high, based on arguments very similar to those raised in opposition on the mailing list and during the copyright discussion linked above. Those arguments seem right to me, and if they aren't I would like to understand why not. Or in the alternative, if the community (or perhaps the Wikimedia foundation and Jimbo specifically) have decided to accept the risk that these pages might be a copyright violation but intend to proceed anyway, then I would like to see an acknowledgment/discussion of that which comes more recently than the deletion of the Columbia article list. I would like to imagine that these lists are not simply an act of quiet opposition by people disinclined to consider the copyright implications. Dragons flight 18:15, Jun 19, 2005 (UTC)

This is all paranoia. I am no legal expert, but it seems to me that there are a range of reasons why this is not a copyvio, not least the obvious fact that Britannica seem to agree with me, as they haven't done anything about it. Even if it is, what's the absolute worst that can happen? They ask us to take it down. Bluemoose 08:00, 20 Jun 2005 (UTC)
Do you have a link to show that Britannica agrees with your assertion that you can use their content without violating copyright? I may be coming in late and don't see that reference. In what way haven't they done anything about it? Since we deleted the copyrighted content it seems they need do nothing to enforce their copyright. - Tεxτurε 15:04, 22 Jun 2005 (UTC)
Britannica is at liberty to choose the most damaging possible time to bring any infringement action. That's probably not today. This presumes that they are currently aware of it, which is far from certain. If you'd like their opinion, I suggest asking them. Best read Feist v. Rural first though, so you know the reasoning which made an unselected list OK in the US. In Europe, database copyright seems to apply and there's not a fair use exception for that. Jamesday 23:51, 23 Jun 2005 (UTC)
The most damaging time would be as soon as possible; why on earth would they wait? And yes, I was assuming that they are aware of the project. What are you (Texture) talking about when you say "Since we deleted the copyrighted content it seems they need do nothing to enforce their copyright"? Thanks, Bluemoose 08:22, 24 Jun 2005 (UTC)
My apology. Someone pointed out to me that I was addressing the wrong issue. - Tεxτurε 5 July 2005 21:32 (UTC)
In regard to the Feist v. Rural case, in which Feist copied information from Rural's telephone listings, we are not copying any information. Lists were generated of missing topics; these lists are of generic names of places, people, etc., hardly creative. Bluemoose 08:42, 24 Jun 2005 (UTC)
Feist v. Rural laid out guidelines for when a list of facts can itself qualify for copyright, even though the generic names/facts that comprise the list are not individually protected. The relevant provision is that such a list is protected if the selection of items on it is itself a creative act and non-obvious. (Rural failed the test since selecting all phones in their geographic area was considered obvious.) In my opinion, choosing what items to include in an encyclopedia does require creativity. If you accept that, then the list of articles in EB is itself subject to copyright. Hence one would require permission to copy any substantial portion of their index or to create derivative works from it (i.e. by removing those terms which exist in our encyclopedia). Any fair use claims that might have existed are pretty much torpedoed by the fact we are a direct competitor with them. I would argue that the most damaging time is after we have worked through much of the list and improved our encyclopedia with it, so their resulting lawsuit can claim not only willful copyright infringement but also damages to their business through lost revenue as we co-opt their subscribers by intentionally duplicating their topic list. How successful such a suit would be, I don't know, but the worst-case scenario is certainly not a simple take-down notice. As direct competitors, the statutory copyright claims alone can run in the 100s of thousands of dollars. Dragons flight 14:20, Jun 24, 2005 (UTC)
I would add 3 things: 1) It is not a list of facts, but names of topics. 2) In the Feist v. Rural case, it specifically says alphabetic lists are excluded as they are too obvious. 3) The lists are not of all Britannica topics, just a selection (28,000 of 100,000 possible). Plus, as I said above, the lists were (I am 90% sure) generated, as opposed to copied.
Also, as I said above, I am far from a legal expert, so why don't we just ask Jimbo? He is the one who would have to deal with the potential problems. Thanks, Bluemoose 15:09, 24 Jun 2005 (UTC)
In Feist, it only requires selection or arrangement to be creative (Rural failed on both grounds). I agree that alphabetical is an obvious arrangement, but that doesn't impinge on the claim that the selection of articles should be protected. In the past, Jimbo has explicitly deferred to community judgment on copyright issues, but I would be happy to summarize this issue and send a message to the mailing list (WikiEN-l) on which Jimbo participates, if people want to do that at this point. Dragons flight 15:26, Jun 24, 2005 (UTC)
Well, is our selection the same as the EB's selection? Haven't we added a criterion: "not in Wikipedia"? (oh, and I am no legal expert either -- just wanted to point it out) -- Avocado 20:31, 2005 Jun 24 (UTC)
  • If you are concerned about a specific list of the contents of a specific encyclopedia breaking copyright, one possible solution could be to combine those kinds of lists into one (although it would be a large and possibly unwieldy list) - Skysmith 08:33, 24 Jun 2005 (UTC)
I think the list is safe. I believe the only way in which we could infringe any rights Britannica had to the selection of article topics itself would be to create an encyclopedia that only included the same topics theirs does, because the selection of topics was a creative process in the creation of the encyclopedia. The list is basically irrelevant, because if we identically copied their selection just by reading their encyclopedia without making a list, we would still be liable. Instead, we just made a catalog of the contents of their work to ensure we don't miss anything important in our own selection of topics. Our resulting encyclopedia is so much broader than theirs that their editorial decisions do not represent the main substance of our final topic selection, so I don't think we can be liable. Compilation/database protection is rather narrow.
On the other issue above, about whether Britannica can delay suing for infringement however long it likes, this isn't true. A copyright holder can be barred from suing if it unreasonably delays in asserting its rights after it has or should have knowledge of infringing acts—my understanding is that the copyright owner's inaction basically creates an implied license. As to how long is an unreasonable delay, that's always a case-by-case determination. Postdlf 23:04, 24 Jun 2005 (UTC)

I sent an email to WikiEN-l in regards to this topic. If you aren't on the list yourself, you can follow the discussion here: [2]. Of course anyone who wants is also free to join: [3] Dragons flight 18:48, Jun 25, 2005 (UTC)

thanks. Bluemoose 10:07, 26 Jun 2005 (UTC)
After reading the inconclusive threads, I'm concerned enough to have decided not to elevate my risk any more. I'll switch to the Nuttall lists, using Gutenberg as a source. If anyone else wants to do the low-hanging-fruit thing, my PERL script is at User:DavidBrooks/wikimatch. Use it at your own risk. David Brooks 28 June 2005 16:57 (UTC)
How are we actually going to get a conclusion on this? As no individual (other than Jimbo, I suppose) could make a judgement, would it have to be decided by a consensus vote? If it was, then I find it highly unlikely a consensus to remove the lists would be reached. Maybe we should draw up a list of overall for/against arguments. Bluemoose 29 June 2005 11:30 (UTC)
Is maintaining the status quo not sufficient? That is the outcome of a large majority of mailing list discussions. Pcb21| Pete 29 June 2005 12:18 (UTC)
How about asking the copyright holders? — mark 29 June 2005 12:21 (UTC)
Sorry Pete, I don't get your meaning?
(Mark) I wouldn't ask the copyright holders, I would tell them, and see if they complained. It could be a solution though. Bluemoose 29 June 2005 13:29 (UTC)

Chill ppl - let a possible DMCA take-down notice answer the question. Until then you better get busy writing! lots of issues | leave me a message 30 June 2005 13:25 (UTC)

I agree. Given the continuing excellent work on Britannica AND Encarta, I would like to regenerate the Columbia encyclopedia list but do not wish to do the work in vain. I think that the work on Encarta showed that a blue link does not necessarily mean that there is a correct or even complete article behind it. This of course means more trimming, but I found it very interesting that the overlap between missing articles in Encarta and Britannica was less than 600 articles. If the 4,000 articles number is correct for Encarta, that means that they had 3,400 articles that Britannica did not. Per Martin - The non-blue non-moose request, I will not create this list for a few weeks.

The Nuttallian paradox

Nuttall is 73% complete because 73% of articles in Nuttall are in Wikipedia. EB2004 is 38% complete because 38% of articles that were missing at the start of the project have now been covered. Different measurements, but no big deal I suppose. It would be interesting to know, though, what stage we are at in the EB project if we used the Nuttall measurement... to do this all we need to know is how many articles there are in EB2004 in total. Pcb21| Pete 10:50, 19 Jun 2005 (UTC)

The promotional material [4] says that EB 2005 has over 65,000 articles. Dragons flight 11:03, Jun 19, 2005 (UTC)
I think the list of EB topics came from a CD-ROM version, which has more than 100,000 articles, which would mean we are at 83% or higher. Eugene van der Pijll 11:09, 19 Jun 2005 (UTC)
Well, the "Deluxe 2005 CD-ROM" says 82,786 [5]. I can't find an article count for the Britannica "Home Library", but that is presumably larger. Dragons flight 11:16, Jun 19, 2005 (UTC)
The 100,000 figure came from our Encyclopædia Britannica article, but I see now that that is for the DVD version. That was not entirely clear in our article. Eugene van der Pijll 11:21, 19 Jun 2005 (UTC)
Actually, the 100,000 figure (54,592,999 words) for EB2005 contains a lot of duplicates, as it breaks down into the following:
  • 73,570 - From Encyclopædia Britannica
  • 15,716 - From Britannica Student Encyclopedia
  • 1,851 - From Britannica Elementary Encyclopedia
  • 61 Britannica Classics
  • 9,090 from the Britannica Book of the Year (1993-2003)
The Student and Elementary Encyclopedias contain slimmed down versions of articles in the Encyclopædia Britannica (termed Encyclopædia Britannica Library in the DVD 2005 version), so it's a little misleading to add them to the total. The best estimate would be 82,660: all full articles plus the 'Book of the Year' articles. — mark 12:56, 19 Jun 2005 (UTC)
Not too shabby, thanks guys. Pcb21| Pete 11:12, 19 Jun 2005 (UTC)
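
The two measurements discussed in this thread can be put on the same footing with a little arithmetic. Below is a minimal sketch in Python, assuming mark's breakdown for the total. The `MISSING_AT_START` value is a hypothetical placeholder, since the thread never states how many EB2004 titles were missing when the project began; substitute the real count to get a meaningful figure.

```python
def nuttall_style_completion(total_articles, missing_at_start, fraction_of_missing_done):
    """Share of all source articles that already have a Wikipedia counterpart.

    fraction_of_missing_done is the EB-style figure (e.g. 0.38): the share of
    initially missing titles that have since been covered.
    """
    still_missing = missing_at_start * (1 - fraction_of_missing_done)
    return 1 - still_missing / total_articles


# Full Encyclopaedia Britannica articles plus Book of the Year articles,
# as listed in the breakdown above.
EB_TOTAL = 73_570 + 9_090

# HYPOTHETICAL: the number of titles missing at the start is not given
# anywhere in this discussion.
MISSING_AT_START = 25_000

print(EB_TOTAL)
print(round(nuttall_style_completion(EB_TOTAL, MISSING_AT_START, 0.38), 3))
```

The point is only that the Nuttall-style percentage depends on two inputs (total size and initial gap), so any headline number is as good as those two counts.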

A word of caution

Our goal is to create articles and content, not to smudge over content. That is why we should be careful when creating (and checking) redirects to make sure that they are to the same article. They should not be smudged into something related, just for the sake of knocking another one off the list. For instance, I was checking the Battle of Rio Salado and I came across a link to Nasrid Dynasty. You will see that it redirects to the modern town of Granada, which was once the seat of the Nasrids. While the two are interrelated, they are not the same. In fact, it is a lot like having George Washington redirect to the United States or Tudor dynasty redirecting to United Kingdom. EB has articles for both, and so should we. And this is not the only example I can think of ... On a similar note, and I am guilty of this too, the literary characters in Nuttall are being redirected to their books. For instance, I redirected Uncle Toby to The Life and Opinions of Tristram Shandy, Gentleman. Upon reflection, if we can have articles for every minor character in the Lord of the Rings or Star Wars, we could afford to do some major literary figures as well. Danny 18:25, 19 Jun 2005 (UTC)

On redirecting to "supra-topics". Yes we must be careful to resist the temptation to smudge merely to bump up percentages. However I think such redirects can be acceptable - Often the other encyclopedias tolerate much shorter articles than we do. Here we think of stubs as being short-term stop-gaps only. Thus as long as the parent topic clearly identifies the sub-topic (so that a user is never bewildered by a redirect) and makes a good stab at describing the subtopic, they can be acceptable.
Bingo. Actually I covered exactly that point in my first draft of the howto. The mindset should be obvious: imagine the man in the street typing some title into the Go box, and being redirected to a different title. It should be immediately clear that he has found what he was looking for (actually, this is a high standard that many existing redirects do not adhere to). If it can't be made clear, there are other approaches: refactor the existing article, for example, or write a stub-sized but complete article with a seemain. In the end, the result should be an editorially consistent encyclopedia, and the fact there is a list of EB topics should disappear. David Brooks 03:54, 20 Jun 2005 (UTC)
What about redirecting to sections? Acceptable? I have done it quite a bit but am thinking about stopping because the software seems to have off-again, on-again approach to making them work correctly! Alternatively I could write a stub -with a seemain#subtopic as you suggest but am afraid that people unrelated to this project might vfd them as these little pages, essentially navigation pages, are not the way we typically do things. Pcb21| Pete 09:56, 20 Jun 2005 (UTC)
Right now, section redirect seems not to be working, and ISTR a how-to that said it doesn't. If it were, I'd still apply the man-in-the-street test. I think it would be OK if the subsection's title is essentially the same as the original query, otherwise it would be more likely to cause confusion. If you are in the middle of a page, you presumably don't see the "(Redirected from..." hint. I say presumably because, if it were fixed, I suppose there's a possibility the hint could be tacked onto the section title. But that's all moot.
On the vulnerable short pages, I think that's a problem caused by the pseudo-standard "every page must be linked to". Best I can suggest is to make the summary tag reference this project. David Brooks 18:40, 20 Jun 2005 (UTC)
I think that this project is reaching the maturity such that a "don't (immediately) remove your own blue links" guideline wouldn't be too burdensome. It will increase the chances of getting two opinions on each title. Pcb21| Pete 18:59, 19 Jun 2005 (UTC)

Two ideas

1) I think we need a place to dump Nuttall entries that are inappropriate for Wikipedia, even for redirects. E.g. Bulls and Bears refers to Bull markets and Bear markets; no one would ever search for "Bulls and Bears", it's just plain silly. There are plenty of others that I think would never be more than dic defs as well, although that's more debatable. The page would be a link from the Nuttall page such as Wikipedia:Nuttall Encyclopedia topics/innappropriate, and would need a quick description of why it is inappropriate. Your thoughts....

2) How about a focus article, where we could find a page that another encyclopedia has, but Wiki has no equivalent, then work on it and hopefully bring it close to featured article quality. We could have a new one every week, or 2 weeks if it moves a bit slower. I know this may mean people might spend less time making new redirects/stubs, but I think this project has no specific focus on quality at the moment, it's all about quantity, so it would be a good addition.

The article would have to be;

  • Non existent in wikipedia (obviously).
  • Potentially a really good article.
  • Researchable on the internet.
  • Generally popular.

What do you think?

Bluemoose 18:47, 19 Jun 2005 (UTC)

Hi. If you think the article is absolutely inappropriate, like Bulls and Bears, then delete it from the list. I have deleted quite a few slang terms already from the general list--how many different redirects do we really need for vomit? I like the second idea a lot, and suggest we start with Nasrid Dynasty, about which I explained above. Nothing wrong with a little book research either. Danny 18:49, 19 Jun 2005 (UTC)
Main article Nasrid with Nasrid Dynasty a redirect, or vice versa? As it happens I created Antequera early in the project, and I have some nice pictures of Granada, so I guess I should put it on my todo list. David Brooks
I tend to lean on the side of being extremely cautious in deleting articles without "blue-ing" them. For example, you say "no one would ever search for bulls and bears". However, to my mind that sounds like a very natural thing for a savvy searcher to search for - by including both "bulls" and "bears" in my search I am much more likely to get results about the markets than about animals. And this applies in general - what you think is trash others might not - so be cautious. It doesn't matter if the lists stay longer than they should for just a little while; after all, no one is offering us a prize for finishing by Christmas.
As for the second idea. A tough task indeed but heck that doesn't tend to stop people attracted to this project! Let's give it a whirl. Pcb21| Pete 19:08, 19 Jun 2005 (UTC)
I added a bit to the project page about having a focus article. Feel free to edit it mercilessly. Bluemoose 07:50, 20 Jun 2005 (UTC)
I see the Nasrid dynasty has been named as a focus article. I think the focus on the dynastic line misses something about the Nasrids. What's missing is their effect on the region's art (including pottery and architecture) and politics. They aren't mentioned by name in al-Andalus, which is itself the subject of a dispute; they're not mentioned in Alhambra either. EB1911 The Nasrides is about the politics. Proposal: a Nasrid main article that outlines the events and effects of the period, with the dynasty listing as a stand-alone see-also. Otherwise we could outline the culture in the existing article, but I'm not convinced it's the right title. Unhelpfully, most mentions in existing EB articles are references to specific members of the dynasty. Sorry this is more talk than editing: a new job is taking much of my oxygen. David Brooks 15:04, 22 Jun 2005 (UTC)

Nuttall

The thing is, Nuttall sucks in terms of information. But, with a little work, you can really make something good out of its little stubs. I am proud of this one Nation of shopkeepers. When you come across something short, add to it. It can get a lot better. Danny 04:18, 21 Jun 2005 (UTC)

Cool little article, thanks Danny. Those fun little articles need to get linked to, maybe as see also from Napoleon, Wealth of Nations and Psychology of the British? Pcb21| Pete 07:07, 21 Jun 2005 (UTC)

Marwell

Hi

I was looking through the EB page 18 to do list, and noticed Marwell Zoological Park was on there. There was already a page for Marwell Zoo so I moved the text from that one to the Park name (since that is the official name) and redirected the Marwell Zoo page.

Do I take the link off the list of missing pages now? Do I need to do anything else? Thanks MyNameIsClare talk 11:00, 21 Jun 2005 (UTC)

Hi Clare. Thanks so much for joining in the project. You are welcome to remove the link from the list if you want to (many users find making one item shorter gives a little sense of satisfaction). However if you want to leave it for someone else to remove that's even better - it gives someone a chance to see what people have been working on, maybe check your work is ok etc.
Just so you know for the future: The best way to move a page is to use the "move" button on the row of tabs at the top of the page. This is better than doing a "copy and paste move" because it preserves the history of everyone who has contributed to the article (see the "history" tab). Pcb21| Pete 11:48, 21 Jun 2005 (UTC)
Thanks Pete, I didn't see the move button there but I will use it in future MyNameIsClare talk 11:53, 21 Jun 2005 (UTC)

Nuttall Friday

This project is really taking off. I want to propose that this Friday (June 24) be declared Nuttall Friday. On this day, we make a concerted effort to do real damage to the Nuttall lists. Danny 02:24, 22 Jun 2005 (UTC)

Would anyone have objection to me merging all 25 pages onto 4 larger pages, reason being that then i could easily add some useful searches, as i find it very tedious going at the moment. Bluemoose 08:50, 22 Jun 2005 (UTC)
No objections here. Pending some unforeseen occurrence, I'll dedicate the whole day to Nuttall articles - though I might need some reminding. Ambi 15:18, 22 Jun 2005 (UTC)
Danny, how about handing out sections to the regulars? (those of us who can put in some work Friday, anyway — not sure if you can count on me yet). David Brooks
Best to pick and choose according to interests, I think. :) Ambi 16:30, 22 Jun 2005 (UTC)

Pick a number... Any number....

Anyone else notice that the % done for EB 1911 doesn't match in the project page and the info box? Which is correct? -- Avocado 23:42, 2005 Jun 25 (UTC)

Pcb21 seems to have fixed it: thanks! -- Avocado 00:00, 2005 Jun 26 (UTC)

Project scope

The main goal of this project is to ensure that Wikipedia has a corresponding article for every article in every other encyclopedia available.

It is tempting to add all sorts of missing article lists to this project (e.g. music articles), but that is outside its scope. This is about articles that other encyclopedias have and we don't. The missing music articles, for example, could easily have their own project. Otherwise this project will turn into an "every missing article" project, which is basically what the whole of Wikipedia is, if you understand my meaning. Thanks. Bluemoose 10:07, 26 Jun 2005 (UTC)

I know the project title may be slightly misleading, but "WikiProject Articles that other encyclopedias have and we don't" isn't as catchy. Bluemoose 10:08, 26 Jun 2005 (UTC)
Actually, I believe they come from a collection of encyclopedias specializing in music. Danny 10:20, 26 Jun 2005 (UTC)
Well, it doesn't say that, and even if it is, it is far too specific. Imagine if every subject had a list like that: we would be totally swamped, and the project rendered useless. It would be much better for everyone if it had its own project, or at least just attached itself to an existing music related project. Bluemoose 13:23, 26 Jun 2005 (UTC)
It would fit in much better as a descendent of Wikipedia:WikiProject Music. Bluemoose 13:26, 26 Jun 2005 (UTC)
It could be a crossover list, relevant to both. Besides, if it draws more people in, why not. Besides, there are many articles listed there that we have. Danny 13:35, 26 Jun 2005 (UTC)
Bluemoose, the music articles are from a music encyclopedia. We would not be swamped because wiki is not paper, and because there are a finite number of subjects that have encyclopedias about them. The goals of the missing articles encyclopedia project are clear: The main goal of this project is to ensure that Wikipedia has a corresponding article for every article in every other encyclopedia available.. The list of topics I posted can all be found in the existing published encyclopedias of music. Gmaxwell 17:04, 26 Jun 2005 (UTC)
I agree with this: I think the list of music articles belongs. And if someone wants to add another list of articles from encyclopedias of philosophy or religion or art history, by all means do. Those more specialized encyclopedias capture some things the general paper encyclopedias do not, and this is precisely the goal of Wikipedia. Cheers, Antandrus (talk) 17:09, 26 Jun 2005 (UTC)
A note on this, I am currently working on building lists from a number of other notable specialist encyclopedias in other fields. So your wish will be granted. It's a laudable goal to make wikipedia match EB, and we've made a lot of progress.. But wikipedia's fame and value comes from not only replicating EB, but from catching a lot of what they miss. The input from specialist encyclopedias will help us a lot.
Which encyclopedia? Please specify - we should be open. If this is from an electronic copy of Grove's index (it smells like Grove, but I realize that has only about 30,000 articles), then the current EB copyright discussion is applicable.
Second point: if this goes ahead, we should rope in the folks on Wikipedia:WikiProject Classical music. David Brooks 18:49, 26 Jun 2005 (UTC)
This project was set up to focus on the EB2004/1911 and general topics lists; Nuttall was an obvious addition. Please do not hijack it, otherwise it will just become a list of every missing article on Wikipedia, which is just stupid. Why don't we then include the requested articles and every other one of the hundreds of lists with missing articles? This will just dilute the focus of the project and render it useless.
Missing music would be much better suited as a separate descendant project of this one and the wikiproject on music. Bluemoose 18:58, 26 Jun 2005 (UTC)
I have made music into a descendant project now, of this project and the general music project, this will be better for all concerned, as music editors will be attracted to it now. Bluemoose 19:12, 26 Jun 2005 (UTC)
Bluemoose, this list isn't just a list of topics. It's a list from an encyclopedia. It fits the charter perfectly and it will help draw more editors to the missing encyclopedic article project. There is no need for the work on the missing music articles to be limited to music specialists; a majority of the articles are biographical and can be written by people without any special interest in music. There is also a fair degree of overlap with the missing EB articles. I'm a bit put off in that you've failed to respond directly to any of my comments and persist in just switching things around in a vacuum. I'd be glad to discuss this but communication must go both ways. --Gmaxwell 19:49, 26 Jun 2005 (UTC)
What? I am actually having to bite my tongue a lot here, you have had (as far as i can tell) no interaction in this project, then you jump in changing things with no consultation, when there is opposition, you just revert back to how you want it, this is incredibly annoying.
I made a single reversion, but only after I realized you had been working under the impression that the music material was just a list of topics and not existing encyclopedia articles. I even left you a note on your talk page explaining it. I only made the articles an addition to the project after extensive discussion with Danny, who has been very active in the project. You have not replied to my statement that they are encyclopedia articles and thus belong, either on your talk page or on here. --Gmaxwell 21:30, 26 Jun 2005 (UTC)
Please do not patronise me and tell me what this project is and isn't. When I made this project a couple of weeks ago, it was for one reason: to focus efforts and advertise the 2004/1911 and general encyclopedia topics, with Nuttall as an obvious addition. It was not to make a list of all missing topics. That would be very self-defeating to the point of the project. I realise that your topics are from an encyclopedia but it is out of the scope of this project. Hence I made your lists into their own project, linked to from here and the Wikipedia:WikiProject Music. I know anyone can make a music article, that can be said of anything, but that is beside the point: it is not a general topic. Your music lists will be much better off now as people interested in music have a much greater propensity to make articles on it than the wiki population in general.
The list of EB articles has existed and been worked on for months. I am not making any effort to patronize you, and I'm sorry that I came across that way. I agree that it's fine to move the music stuff to another part of the list, but I agree because the change is inconsequential... not because I see any difference between the EB topics and the music encyclopedia topics. What difference does it make? --Gmaxwell 21:30, 26 Jun 2005 (UTC)
Anyway, I think we have come to a conclusion beneficial to all. However, I suggest you make a template for the music articles, so you can add more information etc., as it is clearly not going to be possible to add much more to the template for this project. Bluemoose 20:58, 26 Jun 2005 (UTC)
I'm sorry, I don't see any we speaking here. You have made a decision. I think your current position is acceptable, but I'd like to understand it a bit better. --Gmaxwell 21:30, 26 Jun 2005 (UTC)
P.s. I think having descendant projects is a very good idea, and should be encouraged for other subjects. Bluemoose 21:01, 26 Jun 2005 (UTC)
Could you explain how it differs at all? --Gmaxwell 21:30, 26 Jun 2005 (UTC)
(outdent for sanity) - can't we reach consensus by putting a suitable note on the project page. The primary and original purpose of the project was to match the scope of another major general-purpose encyclopedia, and newcomers are encouraged to help that effort in particular, but there are other similar projects that may attract a special interest.
I'd be happy with any solution that leaves the music articles clearly linked from the project page, as the list of topics is clearly a member of the same family of resources. I'd also like to see it kept in the infobox, but if that isn't good for space reasons then perhaps we should just omit it until Nuttall is complete and add it then. Gmaxwell 22:57, 26 Jun 2005 (UTC)
And, GM, I ask again: which music encyclopedia? By not identifying it you're beginning to sound coy. David Brooks 22:33, 26 Jun 2005 (UTC)
Sorry DavidBrooks, I missed your earlier comment entirely because of lack of indent. :) I'm not trying to be coy at all. The list I uploaded contains topics found in many music encyclopedias, but is not a complete duplication of any of their indexes. Although there are no entries which can not be found in one of the notable music encyclopedias, there are still some articles in those sources which will not be found in my list (i.e. it is mostly but not entirely complete). The list differs substantially from the indexes in those sources because I made a strong effort to make it follow the wikipedia naming conventions (although I did make a number of mistakes). If it is impacted by any decisions related to the EB article at all, it should be far less impacted. Gmaxwell 22:57, 26 Jun 2005 (UTC)
Fair enough. As you describe it, a lot of work has gone into it. I think it becomes a less compelling project than EB, because we can characterize EB as "coverage of topics that are notable because a respected work says they are". So using Grove as a standard, for example, would have given the project a harder edge (but would court the copyright questions). And, again, I suggest you go to the Classical Music wikiproject to advertise. David Brooks 23:34, 26 Jun 2005 (UTC)
GM, projects on wikipedia always have a specific task/goal, that is why there are many sub projects of the music project for example. I just want to maintain the specific goal of this project, and not broaden it too much as it would be easy to do. Simple as that. Bluemoose 22:42, 26 Jun 2005 (UTC)
No argument there, but what I see here is a project which groups together several encyclopedias, each of which has its own project page. I don't see how adding an additional encyclopedia broadens the goal at all... In any case, it's all irrelevant because the project doesn't actually accomplish anything: the work is in creating the articles ... a goal that all this debate does nothing to further. Gmaxwell 22:57, 26 Jun 2005 (UTC)
I hope I'm not splitting hairs here, but if this: The main goal of this project is to ensure that Wikipedia has a corresponding article for every article in every other encyclopedia available remains the project goal, and the music articles are indeed from encyclopedias, then we are still within the scope of the project. But leaving aside potential hair-splitting, whether they are exactly in this project or just linked from it, it's a great list and I'm already using it, so thanks GM. Cheers, Antandrus (talk) 22:51, 26 Jun 2005 (UTC)

I have been working on these lists for a lot longer than most people, so I think I have a right to add my $0.02 here. The purpose of these lists, and the project in general, is to give Wikipedia the most possible coverage, by ensuring that we have everything that EB, Nuttall, Grove, Encarta, Columbia, and other encyclopedias have. This includes specialist encyclopedias, like music, art, or Judaism (see below). Hell, I just ordered a historical dictionary of India off of E-Bay for precisely this purpose. I agree that most of our current efforts should be on 2004; however, it does no harm in the least to have the other encyclopedias listed. In fact, they will help the overall effort. I know that the General Encyclopedia, Nuttall, and EB have been helpful to each other. Why not add some more? Maybe they will attract people to the wider goals of this project--ensuring the most extensive WP coverage on all topics. Danny 02:22, 27 Jun 2005 (UTC)

I can see this is where the problem is; The main goal of this project is to ensure that Wikipedia has a corresponding article for every article in every other encyclopedia available - I implicitly meant general encyclopedias, such as Britannica, encarta etc. I will change that now, sorry for the confusion.
This is already the broadest and most ambitious project in Wikipedia. If you believe having more lists/goals etc. will do no harm, then you do not understand what a project is for; to quote Wikipedia:WikiProject best practices, a project should "Be as specific as possible". Adding more lists, while technically within the original remit ("every encyclopedia" as opposed to "every general encyclopedia"), clearly massively broadens the scope, and opens the floodgates to make this a page with lots of links on it rather than a project with a focus.
Anyway, please let that be the end of it. I know you think i am being annoying, but i was pleased to see the music lists appear, and really think being a descendant project will be much better for it and for us (mainly for it though). thanks Bluemoose 28 June 2005 07:26 (UTC)
I agree with Danny. List whatever we can get our hands on, but concentrate on 2004. Exactly how it is laid out, via descendant projects or whatever, just seems like organisational noise. As long as we can find the lists we are golden. Pcb21| Pete 28 June 2005 07:34 (UTC)
p.s. Isn't it a bit ironic that Wikipedia:WikiProject best practices doesn't make sense? Pcb21| Pete 28 June 2005 08:16 (UTC)

Jewish Encyclopedia

Hello. As a participant in this project, you may have noticed that there are quite a few Jewish subjects in EB2004 waiting to be bluelinked. If you come across missing topics related to Jewish history, religion, culture, or biography, you may wish to consult and adapt the material from the public domain Jewish Encyclopedia when you are creating the corresponding articles for Wikipedia. Although the JE was published in 1901-6 and urgently requires updating in places, some of the historical research it contains (especially on Jewish life in the Middle Ages and the Enlightenment) cannot be found with such clarity or convenience anywhere else on the English-language internet.

To avoid duplication of effort, I would like to ask if anybody has already compiled a list of topics treated in the Jewish Encyclopedia in order to compare it with Wikipedia's coverage of Jewish subjects. Would there be an efficient way to create such a list without having to slowly copy the 15000 titles manually, only ten of which can be displayed at a time by the browse feature on the Jewish Encyclopedia website? --Defrosted 01:34, 27 Jun 2005 (UTC)

Hi, I know that in the past, RK has used that as a source. So have I. The information is rather dated, but it is certainly a good start. I don't know if there is a way, but I do think it would be really useful for us to have the list. I will be happy to work on it. 02:25, 27 Jun 2005 (UTC) (who happens to be a Jewish historian by day, Wikipedian by night--God, my life is a mess)
Hi, thanks for offering to help. Please let me know what you would like to do. When editing on wikipedia was disabled yesterday, I began pulling the article titles and previews out of the Jewish Encyclopedia directory. To my surprise, it took only around an hour to manually collect and wikiformat 200 listings and article previews, but the task does get tedious and tempting to procrastinate. Today, I uploaded the lists for L (first half), O, Q, U and X at Wikipedia:Jewish Encyclopedia topics. The rest of the Levys should be finished by tomorrow (maybe the next priority should be to do all the Cohens in C).--Defrosted 28 June 2005 10:39 (UTC)
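
The wikiformatting half of this task is the easy part to mechanize once the titles have been collected by whatever means. A minimal sketch in Python follows, assuming the titles are already in a plain list; how they are gathered from the Jewish Encyclopedia site is deliberately left out, and the sample titles are hypothetical placeholders. The `* [[Title]]` output is ordinary wikitext for a bulleted list of links.

```python
def wikiformat(titles):
    """Turn raw titles into a bulleted wikitext list of links.

    Collapses stray whitespace, skips blank entries, and drops exact
    duplicates while keeping the original order.
    """
    seen = set()
    lines = []
    for raw in titles:
        title = " ".join(raw.split())
        if not title or title in seen:
            continue
        seen.add(title)
        lines.append(f"* [[{title}]]")
    return "\n".join(lines)


# HYPOTHETICAL sample input, not real Jewish Encyclopedia headings.
collected = ["  Example Title A ", "Example  Title B", "", "Example Title A"]
print(wikiformat(collected))
```

Pasting the output into a page such as the topic lists mentioned above would show each missing title as a red link and each existing one as a blue link.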

3 for the price of 1

I made Boy bishop a while ago for Nuttall, then noticed it was on 1911 and 2004! Does anyone have any way of estimating how much overlap there is between the lists? Bluemoose 28 June 2005 07:56 (UTC)

The aborted merge project (see User:Pcb21/list_temp) gives some data. There appears to be little overlap, because subjects so important that they got a mention in 1911 and 2004 tend to be so important that WP has already covered them. Kudos on finding one that didn't fit that pattern! Pcb21| Pete 28 June 2005 08:23 (UTC)
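
For anyone who wants to estimate the overlap directly from the lists, a set intersection over normalized titles is one way to do it. This is a minimal sketch in Python, not the method used in the aborted merge project; the three short lists are illustrative samples built from titles mentioned on this page, not the real missing-article lists.

```python
def overlap(*title_lists):
    """Titles that appear in every list, compared case-insensitively."""
    normalized = [{t.strip().casefold() for t in titles} for titles in title_lists]
    return set.intersection(*normalized)


# Illustrative samples only; the real lists are far longer.
nuttall = ["Boy bishop", "Uncle Toby", "Nation of shopkeepers"]
eb1911 = ["Boy Bishop", "The Nasrides"]
eb2004 = ["Boy bishop", "Nasrid Dynasty"]

print(overlap(nuttall, eb1911, eb2004))  # {'boy bishop'}
```

Passing only two lists gives the pairwise overlap, which is the kind of figure quoted earlier for Encarta and Britannica.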

Just a Note

This is a valid project, but you should remember that a topic isn't more deserving of an article simply because it appears in another encyclopedia. Check out Wikipedia:Requested articles from time to time as well. They should all get articles too! Superm401 | Talk July 3, 2005 01:40 (UTC)

Yes, they should. Perhaps a project could be organized to help them along. One does not contradict the other. Danny 3 July 2005 23:07 (UTC)

Nuttall progress

Just a note to let you all know that Nuttall is progressing nicely. In fact, there is some overlap between the two lists, which is nice to see. Personally, I like smaller lists. That is why I broke up the lists alphabetically. It seems to be working too. On EB, we have already finished 3 letters and are making nice progress on a fourth, in addition to filling out the standard pages. That said, I would like to draw people's attention to Wikipedia:Nuttall Encyclopedia topics/1. Just thirteen to go out of a full page. Take a look, make a link, and enjoy. Danny 3 July 2005 23:07 (UTC)

P.S. Done!!!

Capitalization and other minor alterations

I have no time to really get into this project but I had a look at various lists (I have my own list I periodically check through preview). Sometimes there is only a difference of capitalization (Abyssal Zone in the List of Encyclopedia Topics and already existing abyssal zone, for example). Some encyclopedias use the plural form when WP often uses the singular (Basilian Monks and Basilian monk - I already fixed that).

If I remember correctly, somebody somewhere had a script that checked the database dump and still-red links for differences in capitalization and the like and created a list of red links and possibly connected existing articles. Could that be useful? - Skysmith 5 July 2005 10:24 (UTC)

You might be thinking of the "Red Links project" or whatever it is called that attempts to remove red links by doing redirects of that sort. Also there is David Brooks's fruit-finder specific to this project (see this page and/or the EB2004 talk page). David has made the script available, but I don't think it has been applied to LoETs yet. Pcb21| Pete 5 July 2005 11:01 (UTC)
I have noticed somebody also has a bot that makes redirects for different diacritics. Bluemoose 5 July 2005 11:59 (UTC)
It might have been this - although it appears that Daniel Quinlan is in hiatus. Likewise Brion who wrote the script - Skysmith 6 July 2005 10:05 (UTC)
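The kind of check discussed in this thread (red links that differ from an existing title only in capitalization or a trailing plural) can be sketched roughly as follows. This is a minimal illustration, not the script mentioned above: the function names `normalize` and `suggest_redirects` are hypothetical, the plural rule is a deliberately crude assumption, and the titles are passed in as plain lists rather than read from a database dump.

```python
def normalize(title):
    """Fold a title to a comparison key: lowercase, collapse whitespace,
    and drop a single trailing 's' from the last word (crude plural rule)."""
    words = title.lower().split()
    if words:
        last = words[-1]
        # Assumption: only strip 's' from longer words that do not end in 'ss'.
        if len(last) > 3 and last.endswith("s") and not last.endswith("ss"):
            words[-1] = last[:-1]
    return " ".join(words)


def suggest_redirects(red_links, existing_titles):
    """Return {red_link: [existing titles sharing its normalized key]}."""
    index = {}
    for title in existing_titles:
        index.setdefault(normalize(title), []).append(title)

    suggestions = {}
    for link in red_links:
        matches = [t for t in index.get(normalize(link), []) if t != link]
        if matches:
            suggestions[link] = matches
    return suggestions


if __name__ == "__main__":
    existing = ["Abyssal zone", "Basilian monk", "Boy bishop"]
    red = ["Abyssal Zone", "Basilian Monks", "Gavin Hamilton"]
    for link, matches in suggest_redirects(red, existing).items():
        print(f"{link} -> {matches}")
```

Any match would still need a human check before a redirect is created, since a shared key does not guarantee that both titles refer to the same subject.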

Off on vacation

I've been one of those burning through Nuttall, but tomorrow (Jul 6) I leave on vacation to England, mostly for family-duty stuff, and Normandy/Brittany for fun. Back on Aug 1. One thing you can sort out while I'm away: having two parallel lists means that important notations and warnings have to be added twice. Is there any way of fixing that? David Brooks 6 July 2005 03:18 (UTC)

Your family can go without you can't they? ; ) The reason i made the new nuttall lists was only because i wanted to add the external search links, which would have taken ages on small lists. It would be quite easy to copy the external linked red links from the new lists into the 25 old pages, if you know what i mean. Then we would have one list with a letter per page (ok y and z can share) with the external links. Bluemoose 6 July 2005 08:58 (UTC)

Another Nuttall milestone

The list of K's in Nuttall has only 4 articles left, thanks to some help from the Polish Wikipedia, which added Kunowice. We can finish that easily~ Danny 9 July 2005 04:44 (UTC)

Some help would be appreciated in knocking out the rest of the A's in Nuttall. I think we can do it in less than a week. Danny 01:07, 10 July 2005 (UTC)

And we did! No more A's. Now, how about the E's? Danny 11:31, 14 July 2005 (UTC)

Focuses

All 4 big nuttall pages have had focus now (and we made great progress!), I personally think we should move on to 1911 now, if only for a few weeks, as magnus added some very useful links, and it obviously has a generally great public domain source (plus it overlaps with nuttall quite a bit anyway), so I think we will storm through it. What do you think?

p.s. I am up for adminship at the moment, if you care to vote! Wikipedia:Requests_for_adminship/Bluemoose

Bluemoose 22:24, 18 July 2005 (UTC)

I think it is a great idea, and I have done a few of the articles there (1911) recently (incl. two on the 2004 list yesterday). One thing about Nuttall though--I really think we can finish the whole thing in a month with a concerted effort. That would also be quite an achievement. Oh, and you got my vote. Danny 01:23, 19 July 2005 (UTC)
I've considered datadumping the Nuttall stuff to "finish" it in one go. Obviously everything dumped would be tagged with a new {{NuttallUnchecked}}, and also sectionized with something like "Wikipedia does not have material on X yet. Below is the corresponding entry from the Nuttall Encyclopedia which may give some idea about the topic from a historic perspective."
Obviously we'd still have a big project, emptying out the new Category:NuttallUnchecked, but the thinking here is that a well-worded something created by bot will be better than nothing. Thoughts? ~~~~
P.S. You got my vote too. ~~~~
I can see the attraction of that idea, but I'm extremely dubious about it. Datadumps are rarely a good idea, especially because (1) there are clearly some articles in Nuttall that don't actually belong in WP (some should be wiktionary, some maybe not appear at all); (2) redlinks demonstrate holes in coverage better than unchecked stubs; (3) it's disingenuous (is that the right word?) to have a Wikipedia article that says "Wikipedia doesn't have an article on this"; (4) the natural inclination will be simply to wikify the (often inaccurate and out-of-date) Nuttall entries, rather than write a proper article drawing on other sources as well. I agree that we should be able to finish Nuttall the way we're going, so let's keep it as a focus rather than switch to 1911. There's no rush - we'll get there eventually - but I'd rather do it with properly crafted articles than with a datadump. OpenToppedBus - My Talk 08:56, July 19, 2005 (UTC)
I would have to agree with OpenToppedBus, attractive though the idea is. I would also add that i rarely actually use nuttall information, 90% of the time I get the information from 1911 as it is much better. p.s. thanks for nice votes/comments! Bluemoose 09:34, 19 July 2005 (UTC)
I also agree. I'd rather take the time and do something more serious and reliable than just data-dump material. Danny 10:17, 19 July 2005 (UTC)
It's funny. I have the same opinion about what we are trying to achieve as the three of you. I specifically tried to word my idea so that it would take care of the concerns that you were sure to raise. But you all raised them anyway ;-). To be honest I think it is misplaced - we carry on doing something serious and reliable, we carry on as we were with our medium- and long-term goals. However I do believe a well-crafted template will be a short-term benefit that will a) give our readers some information now and b) make it easier to transform articles to a decent quality going forward. If you view Wikipedia as a work in progress that is incrementally-improving all the time I think this makes a lot of sense. Pcb21| Pete 10:54, 19 July 2005 (UTC)
Specific comments on OpenTopped specific points. Re 1) Agreed. Part of checking an article would be to transwiki/delete when appropriate. 2) Not sure if I agree or not - red links from where? 3) Seems a bit like semantics rather than a genuine problem. Just a question of getting the template wording right. 4) Clearly you are talking about wikify-ers unaware of this project. Again perhaps appropriate links in the boilerplate will minimize problems any such people will cause. Pcb21| Pete 10:54, 19 July 2005 (UTC)

This week

I have selected the first list from EB 2004. While we are doing quite well there informally, let's see how well we can do with a full frontal attack on the opening list. Can we reach 60 percent? 70 percent? I have also selected List 3 from Nuttall, i.e., the B's. This is the biggest list on Nuttall, with 248 articles--about 10 percent of the total remaining articles. I wonder if we should also consider adding a 1911 list too. One word of caution. I have just restored Gavin Hamilton to the list. He is a Scottish painter according to EB, not just a cricket player, as we currently have him. Last night I noted that Jean Leclerc is not just a minor television actor but also an important Belgian biblical scholar. I have encountered a few other examples of this. Please check all names before removing them from the list. Otherwise, we can lose valuable information. Better to be a little slow and more thorough than quick and miss some. Danny 10:59, 19 July 2005 (UTC)

Point well taken. Some names are quite common. There are now articles for Jean Leclerc (theologian), Jean LeClerc (painter), Jean LeClerc (actor), and articles are awaited on Jean Leclerc (hockey player), and Jean Leclerc (singer). Bejnar 20:04, 10 September 2006 (UTC)

Another general encyclopedia

I'm working on preparing lists of missing articles from the Hutchinson Encyclopedia (acquired here). Progress so far at User:OpenToppedBus/HutchinsonA, User:OpenToppedBus/HutchinsonB and User:OpenToppedBus/HutchinsonL. Aiming to have this ready to work on by the end of the week. OpenToppedBus - My Talk 16:18, July 25, 2005 (UTC)

Note that of these, User:OpenToppedBus/HutchinsonB is the closest to being ready to go "live". Once I've got the others to that state, I'll re-alphabetize and split into sensible sized pages. OpenToppedBus - My Talk 16:29, July 25, 2005 (UTC)
Nice one, I'll add some quick links/searches to them if you want. Bluemoose 16:46, 25 July 2005 (UTC)
OK. Page now live at Wikipedia:Hutchinson Encyclopedia topics. I figured that they may as well all go on one page as there were only just over 1000. Please feel free to add searches and links. There's a lot of low-hanging fruit - for example, people including middle names who just need to be redirected. OpenToppedBus - My Talk 11:25, July 26, 2005 (UTC)
Thanks for this. I've just gone through and done quite a bit of the low-hanging fruit. Time to take a break for now, but I'll be back.
It's an interesting list. Probably at least 80% of it will be redirects, but some of what'll be left might not turn up in the other encyclopedias, given that Hutchinson has a bit of a UK focus - things like working men's club, Youth Training Scheme and Advisory, Conciliation and Arbitration Service. --OpenToppedBus - My Talk 14:02, July 26, 2005 (UTC)

Nuttall problem

We are making really good progress in the Nuttall encyclopedia. Unfortunately, however, the excitement about finishing it is leading to a few little problems. Nuttall is almost 100 years old. Its information about geographical locations is often even older. This can pose a problem, since it is tremendously out of date. For instance, when removing some of the Y's, I came across "a district in German East Africa." I am quite confident that it is not in German East Africa anymore. Please check places carefully, and do not just copy and paste Nuttall text there. It may take a bit longer, but at least we can be relatively accurate. Danny 01:22, 26 July 2005 (UTC)

And to think you rejected my plan to avoid any problems of this type! ;-). Pcb21| Pete 07:39, 26 July 2005 (UTC)
Your plan? If it works, I say go for it. Danny 11:04, 26 July 2005 (UTC)
Actually I meant my plan in ==Focuses== above. The idea is that material from 1911/Nuttall would be added to the 'pedia under a clear heading of "historical view" equipped with a disclaimer. Thus people are stopped from data-dumping without thought, and we are able to modernize the material as part of the ongoing project in a careful way. Pcb21| Pete 12:37, 26 July 2005 (UTC)

Different focuses

Magnus has created 1911xNuttall, 1911x2004xNuttall and 1911x2004 lists. I think we should only focus on these for a bit for a number of reasons; 1) they should be easy having multiple sources; 2) 2 or 3 for the price of 1 will be good; 3) the fact they are in more than one other encyclopedia means they are probably more high profile.

I have added one as a focus, but maybe we should do all three? Bluemoose 10:20, 26 July 2005 (UTC)

An observation and a suggestion

It seems to me that the small lists get done really quickly, while we plod through the longer ones. Look how well we did on the various small letters in 2004 (Q, U, X, Z). With that in mind, and given how much harder it is becoming to find links, I would like to propose that we focus on smaller amounts in upcoming collaborations of the week, for instance Section 3 of 2004's page 2. With a more specified concentration, we are sure to make a larger dent in good time. So, whaddya think? Danny 01:33, 29 July 2005 (UTC)

I think it is a mighty fine idea. I wonder how many encyclopedias have articles on psychological crutch? Pcb21| Pete 07:50, 29 July 2005 (UTC)

To attempt this, I have divided page 4 into sections of 60 articles, breaking up the final section to distinguish between B articles and C articles. I then placed the focus on a particular section, 3, and removed the blue links. Only 44 left. If we finish that before Tuesday, someone can easily pick another section. Danny 11:38, 29 July 2005 (UTC)

Catholic Encyclopedia

In the spirit of the Jewish Encyclopedia project, I'm wondering if there shouldn't be a Catholic Encyclopedia effort to include this work as part of the project. Similar to the intent of the Jewish project, not every topic should be included, but the work could be used as a reference for Catholic topics of interest. The original text is in the public domain but the only online version [6] is copyright. A word of warning though: because it was written to serve the Roman Catholic Church and reflect its doctrine, nearly every article has a distinct POV and no article should be included word for word. --Leonsimms 19:29, 29 July 2005 (UTC)

Yeah it would be a nice addition, I briefly talked about it with DanielCD but i dont know if he ever did anything about it. Bluemoose 09:08, 31 July 2005 (UTC)


I've just added the topic lists, arranged alphabetically, to a new project page. I've also added the project to the Project Template. Any assistance is appreciated, though I'm hoping that clearing out Britannica and Encarta will clear out some of the remaining blues, but perhaps interested Catholics will work to add some info. Leonsimms 18:57, 3 August 2005 (UTC)

I've just added search capability to half of the topics listed in the CE (A-L). I would add it to the entire list, but I'm concerned about page size and persons working on a limited connection. Adding three different types of search quadrupled some of the larger pages from ~ 50K to about 200K. Has anyone run into this problem, and what, if anything, can be done to alleviate it? Of course anyone interested in the project can choose a smaller page to work on, but I know this won't be the first time that this project with its long lists has run into this problem and possibly found solutions. Does it make sense to break up the pages? Reflex Reaction 15:15, 23 August 2005 (UTC)
Removing all the blue links will make them much smaller (and removing the inappropriate "or" links too), these pages have an awful lot of blues. The britannica lists used to be up to 180kb, it was never really much of a problem, and I had dial-up when I first uploaded them as well! Martin - The non-blue non-moose 16:46, 23 August 2005 (UTC)
I don't want to take away from the other projects but hopefully this will spur other users to clear out some of the blues. It's a pain to go back and forth between wikipedia and the CE site. I will make the entire list searchable. Reflex Reaction 17:40, 23 August 2005 (UTC)
Many blue links go to articles which badly need expansion. I was thinking of using the CE article to expand the University of Bologna article, for instance, which is a pathetic stub – this is after all the oldest university in Europe. I can easily imagine that the situation may be similar with other things. Uppland 17:57, 23 August 2005 (UTC)
At your suggestion I have been careful to put comments where it looks like wikipedia would benefit from some of the information from the CE. Reflex Reaction 14:30, 7 September 2005 (UTC)

I feel like a midget standing on the shoulders of giants. I have spent the last few weeks trimming blue links from the Catholic encyclopedia and it seems like every third article has {{1911}} at the bottom of the page. The work everyone has done has been extremely impressive and thorough and I just wanted to put out my thanks. Reflex Reaction 14:30, 7 September 2005 (UTC)

Weisstein encyclopedias

Will Eric Weisstein's online encyclopedias also be included in this project? Samohyl Jan 14:02, 31 July 2005 (UTC)

Definitely, i would love a science encyclopedia to be included. In fact i'll start compiling it soon. Bluemoose 15:00, 31 July 2005 (UTC)
I'll compile the science one, someone else do maths. Bluemoose 15:03, 31 July 2005 (UTC)
Actually, i will do it all. Bluemoose 15:16, 31 July 2005 (UTC)

Main reason for project

The only main reason I see for this Wiki project is to outperform the other encyclopedias rather than looking for a harmonic coexistence with them. --Abdull 21:34, 1 August 2005 (UTC)

Correct. Bluemoose 22:14, 1 August 2005 (UTC)
Alternatively, it could be considered that one of the aims of this project is to ensure that there are no major gaps in Wikipedia's coverage, and therefore help counter systemic bias. Bluap 08:45, 2 August 2005 (UTC)
There's an element of truth in both interpretations, though I think the second is more important than the first - it's not about being able to say "My encyclopedia is bigger than your encyclopedia", it's about making sure we haven't missed anything. Although, of course, the danger is that rather than eliminating systemic bias we instead end up mirroring the (unacknowledged) systemic biases of others. We've seen an element of that already, when people haven't been careful enough to modernise some of the more outdated 1911 and Nuttall entries, and there are inevitably also choices made by Britannica/Encarta/Hutchinson as to what they include. Every encyclopedia has biases - Wikipedia is the only one honest enough to acknowledge them. OpenToppedBus - Talk to the driver 11:22, August 2, 2005 (UTC)

Reason to be proud

Just a little bit of good news. 10 lists in 2004 have passed the 50% mark. While only one has passed 60 (and we haven't even concentrated on it yet), quite a few more are on the verge of hitting 50%, which means that we are halfway there! That is why, this week, I want to encourage people to check out some of the other lists too, to see if we can bring them all, or at least most of them, to this landmark figure. Danny 12:42, 2 August 2005 (UTC)

That is good news, lists were first uploaded on 25th feb, just over 5 months or 158 days ago, and we have virtually done 50%, in fact i would guess that if all blues were removed we would on average breach 50%. Question is how long does the second 50% take? Obviously there are almost no easy redirects, but then there are a lot more people helping out now.
The project page has been around since 11th june, 52 days, at which point we were 33% done. Before the project we did 0.31% per day (average), now we average 0.32% per day, even though it has been getting harder due to fewer redirects and "higher hanging fruit"!
Apologies to those who hate over analysing things with statistics! Bluemoose 14:06, 2 August 2005 (UTC)
As they said about World War 1, it will be over before christmas. hopefully a bad analogy! Bluemoose 14:11, 2 August 2005 (UTC)
I've just pushed page 17 over 50% as well. Not really my own work - someone had found and annotated four redirects without actually creating them. OpenToppedBus - Talk to the driver 14:33, August 2, 2005 (UTC)
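The per-day rates quoted in this thread can be rechecked from the dates and percentages given (lists uploaded 25 February, project page created 11 June at 33%, roughly 50% on 2 August). The sketch below is only an illustration of that arithmetic; the function names are hypothetical, and the projection assumes the most recent rate simply continues, which the thread itself suggests is optimistic.

```python
from datetime import date, timedelta

# Figures taken from the discussion above (year 2005).
LISTS_UPLOADED = date(2005, 2, 25)
PROJECT_STARTED = date(2005, 6, 11)
COMMENT_DATE = date(2005, 8, 2)
PCT_AT_PROJECT_START = 33.0
PCT_AT_COMMENT = 50.0


def rate_per_day(pct_gained, start, end):
    """Average percentage points completed per day between two dates."""
    return pct_gained / (end - start).days


def projected_finish(pct_done, rate, as_of):
    """Date on which 100% would be reached if `rate` stayed constant."""
    days_left = (100.0 - pct_done) / rate
    return as_of + timedelta(days=round(days_left))


if __name__ == "__main__":
    before = rate_per_day(PCT_AT_PROJECT_START, LISTS_UPLOADED, PROJECT_STARTED)
    since = rate_per_day(PCT_AT_COMMENT - PCT_AT_PROJECT_START,
                         PROJECT_STARTED, COMMENT_DATE)
    print(f"before project page: {before:.3f}% per day")
    print(f"since project page:  {since:.3f}% per day")
    print("constant-rate finish:",
          projected_finish(PCT_AT_COMMENT, since, COMMENT_DATE))
```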

Wikipedia:Maintenance collaboration of the week

Wikipedia:Maintenance collaboration of the week is taking nominations for focus projects of the week; at the moment the focus is to wikify pages (see the Wikipedia:Community Portal). I have just nominated this project, go and vote for it now! Martin (Bluemoose) 09:53, 5 August 2005 (UTC)

I think you're stretching the definition of "maintenance" a bit, but heck why not :). Pcb21| Pete 11:24, 5 August 2005 (UTC)

Another specialist work, Encyclopedia of Modern Jewish Culture

Those interested, please see User:Hoziron/List_of_entries_in_Encyclopedia_of_Modern_Jewish_Culture. I do not have access to the actual work. Based on Google searches, I have added identifying notes for the red links from A to K. --Hoziron 03:30, August 8, 2005 (UTC)

50% Yippee!

As i am sure you all noticed, Eugene pushed 2004 past 50% yesterday, which is great, but I wonder if we will get to 100% before wikipedia gets to 1 million articles? I think it is important for our credibility that we do, as it will be a bit of a fly in the ointment if the general media report that we have 1 million articles yet don't cover all 100,000 articles that EB do.

Wikipedia:Million pool on average seems to predict May 2006 for 1 million, these lists have been around for 5 months, another 5 months (optimistic) takes us to early 2006, so it may be possible, but I have a feeling it will be close. Martin (Bluemoose) 16:09, 12 August 2005 (UTC)

We may be ok if the sight of the finish brings on board a lot of johnny_cum_latelys hoping to share in the glory :). Pcb21| Pete 16:28, 12 August 2005 (UTC)
First of all, I can't claim the credit for the 50%; I just removed as many blue links as were needed...
We will not get all of them before May next year. The first half consisted mostly of redirects, and Nuttall/EB1911 articles. For the next 50%, we'll need to write real articles. Also fun, but much more work. Nevertheless, feel free to prove me wrong. Eugene van der Pijll 20:23, 12 August 2005 (UTC)
Redirects? Dang, that would be fun. I've been writing new articles, and it hasn't been all that easy. I'm hoping someone will add more content than I could, my articles probably suck. Check out the Golden Whistler page that I made last night. Jack Lumber, Pirate King 04:19, 13 August 2005 (UTC)
IMHO, that's a rather good article. It was only missing a taxobox, which I've just added. Writing these kind of articles is more useful to wikipedia than creating tons of redirects; but as you say, it's not nearly as easy. Eugene van der Pijll 08:04, 13 August 2005 (UTC)
To Eugene... I know it will be a tough job but on en.WP there are 1200 new articles per day. We only need to get about 12,000 over the next 8 months... i.e. 1 in 24 new articles need to be EB articles. Phrased like this, the target seems attainable. There must be lots more Wikipedians minded like us to create these sorts of articles.. we just need to draw people somehow! Pcb21| Pete 12:11, 13 August 2005 (UTC)
Yes, thats why i made this project the "maintenance collaboration of the week", and it does seem to have drawn in lots of new people, in fact we should soon be the most popular project in wikipedia at this rate (we have 70, stub sorting has ~120 people, though most have given up after the main list was finished). Martin (Bluemoose) 12:34, 13 August 2005 (UTC)
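The share of new articles that would have to come from the EB lists follows directly from the figures quoted in this thread (about 12,000 articles needed, about 1200 new articles per day, 8 months). The snippet below is a small worked check; the function name is hypothetical and treating a month as 30 days is an assumption. With those inputs the share works out to about 1 in 24.

```python
def required_share(articles_needed, new_per_day, days):
    """Fraction of all new articles that must come from the target list."""
    total_new = new_per_day * days
    return articles_needed / total_new


if __name__ == "__main__":
    # Assumption: 8 months taken as 8 * 30 days.
    share = required_share(12_000, 1_200, 8 * 30)
    print(f"about 1 in {round(1 / share)} new articles")
```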

Warning

I just erased a bunch of links from one of the pages. Checking them against Britannica, I see that we really did not have links for some, and failed to include important information. Please be careful before erasing that each link is checked against the Britannica entry. It is worth taking a bit more time but doing a job well done than just erasing links. Danny 04:50, 13 August 2005 (UTC)

Agreed. If I find blue links that should not be removed, I always annotate them in the list - to prevent a less scrupulous person from removing them in error. However, I suspect that the mass-remove of links from the Encarta page isn't being quite as rigorous. And I'm sure that the first-round cull of the EB2004 links took out some duds (before these pages were created). Bluap 10:38, 13 August 2005 (UTC)
Even worse to my mind, some people are copying Nuttall directly, creating articles that border on nonsense because they are so dated. We must do something pro-active to stop this. If not, this project will end up doing more harm than good to the 'pedia! Pcb21| Pete 11:29, 15 August 2005 (UTC)
Perhaps we should remove the Nuttall links, now that we have both EB and Encarta to concentrate on. Bluap 11:55, 15 August 2005 (UTC)
It would be sad to do that, but I think it is probably the best course of action, as there is little to gain from nuttall now. A compromise could be to ruthlessly remove all the dic defs and other useless links - which it seems to me there are more and more of. People who like importing public domain stuff would be much better off using the 1911 lists anyway. Martin - The non-blue non-moose 13:57, 15 August 2005 (UTC)

Substubs

I have seen contributors creating half-line substubs on topics from these lists (using the Created as part of the WikiProject Missing articles edit summary), and don't see the point in that when EB 1911 actually has more information that can be used. It will just give the mistaken impression that Wikipedia has an article on the topic, leading (presumably) to the items being removed from the lists. Isn't it better to work slower in creating new articles and actually make something of each article? Uppland 06:11, 20 August 2005 (UTC)

I agree entirely. Ambi 07:03, 20 August 2005 (UTC)
I don't think an article has to be particularly great in order to be worth including. Pcb21| Pete 09:05, 20 August 2005 (UTC)
Generally short articles are fine, because they are labelled as stubs (no such thing as a substub anymore!), and expanded in the future, you could think of them as being basically a marker saying "we need an article here". But short articles from eb1911, when they could have been longer is bad, as it is unlikely that someone wanting to expand the stub would check to see if there is more information in the eb1911. However, virtually all the 1911 imports i have seen have been fine, in fact, people leave too much in, and don't remove the POV/out-of-date material, but this isnt so bad as it is much more fixable, as you can easily list all articles that have the {{1911}} tag.
In general the articles created here are way above the average quality of new articles. Note the above comment in "Examples?". thanks Martin - The non-blue non-moose 10:03, 20 August 2005 (UTC)
But short articles from eb1911, when they could have been longer is bad, as it is unlikely that someone wanting to expand the stub would check to see if there is more information in the eb1911. - That's what I was talking about. See the article Pasewalk, for instance. Uppland 11:22, 20 August 2005 (UTC)
Ah, now thats a little bit different, as information from eb1911 for places (and other dynamic subjects) is well out of date, to be safe i personally assume 1911 is wrong, until verified with another source. It may well be that the 1911 contained no useful info at all, eb1911 is much more useful for biographies, as (to paraphrase user:Womble) people don't change much when they're dead! In fact I have almost only used 1911 for biographies. Martin - The non-blue non-moose 12:56, 20 August 2005 (UTC)
For most cities, you're right. Especially Indian cities have almost no useful info in EB1911. Pasewalk is an exception, however. I've added most of the EB1911 article. Eugene van der Pijll 13:08, 20 August 2005 (UTC)
No, it's not really different. Although population figures, administrative divisions and such things may not be current, that does not make them "wrong". For most people who do not live in a particular place, the historical information may be just as relevant. Uppland 13:58, 20 August 2005 (UTC)
You are right, but it is difficult making a decent article out of it as; the context is out of date, there is a strong POV and often the ocr errors are so bad it's difficult to tell whats going on. Also, I do actually have a plan for a new project when this one is done; to check all articles with the {{Nuttall}} and {{1911}} tags for POV and ocr errors, just to make sure. Martin - The non-blue non-moose 14:14, 20 August 2005 (UTC)
A variation of this topic is also being discussed at Wikipedia talk:Nuttall Encyclopedia topics, and I'm not sure we have a consensus. I was concerned that a Nuttall-derived subsubsubstub could pre-empt use of 1911 material by turning the title blue, but is it true that there are no more Nuttall/1911 duplicate titles? David Brooks 18:29, 20 August 2005 (UTC)
No, as far as i know it isnt true, I followed a conversation magnus had with someone about making more crossover lists, but he was on holiday so couldnt do it, i expect when he is active again he may do a nuttallx1911 which would be great for that problem. Martin - The non-blue non-moose 18:36, 20 August 2005 (UTC)
Why is the 1911xNuttall list in the completed section then? Pcb21| Pete 22:12, 20 August 2005 (UTC)
My mistake, some other list was requested from magnus in that case. thanks pete. Martin - The non-blue non-moose 22:15, 20 August 2005 (UTC)
Confused. Is 1911xNuttall an empty set or not? David Brooks 04:28, 21 August 2005 (UTC)
(de-denting for readability) Yes it is the empty set. View the history of User:Magnus_Manske/1911xNuttall to see how well (or otherwise) the last set were done. Pcb21| Pete 07:45, 21 August 2005 (UTC)

Copyvio

Just so you all know; Isaiah Bowman, Judah Leon Magnes, Pirate Perch, Society of the Sacred Heart, Ferenc Herczeg, Arthur Erdélyi, Sandor Szalay, African literature were created by User:Dubaduba and are all direct copyvios, so don't delete them from the lists. Maybe we need bigger warnings not to copy text from other sources? Martin - The non-blue non-moose 08:54, 22 August 2005 (UTC)

Sheesh. You'd have thought that someone interested in contributing to this project would know all about copyright etc... Bluap 11:50, 22 August 2005 (UTC)

just congratulations

I'm a user from Catalan wikipedia. I was looking for some articles and i got to "more than 2 years wanted articles pages". I just got shocked! Two years and people didn't create the articles!! But then I got to this page (by maintenance tables), and I can just say CONGRATULATIONS. This is an excellent project!!!!

source of lists?

Can someone clarify how these lists were generated? Bluemoose?

From 216.146.93.139 (talk · contribs) which routes to corp.eb.com, i.e. the corporate office of Encyclopedia Britannica. I have no opinion on what the reply should be, and in fact do not even know the answer, but I figure that recognizing who is asking the question could be important here. Dragons flight 00:26, August 24, 2005 (UTC)
I don't know how they were made. Martin - The non-blue non-moose 07:08, 24 August 2005 (UTC)
Then where did they come from? 14:43, 24 August 2005 (UTC)
Since anyone could tell from looking through the page history, I will tell you that they were originally uploaded by a user with the screen name Bogdangiusca (talk · contribs). Presumably he knows how they were made, and since it has never been discussed (to my knowledge) he may be the only one who knows for sure. Dragons flight 14:57, August 24, 2005 (UTC)

Ok, so someone asked a question. That is harmless enough. I am more worried right now about people posting copyrighted material, or removing items from the list without checking them. Danny 00:47, 24 August 2005 (UTC)

That's exactly right, which is why i am trying to get obvious copyvios speedy deletable (See Wikipedia talk:Criteria for speedy deletion). Martin - The non-blue non-moose 06:26, 24 August 2005 (UTC)
I agree it's harmless, but disclosure would be nice. I note that User:216.146.93.139 seems to have been used by at least two people; the first few contributions are silly vandalism, while the remainder are straightforward, with many adding or correcting content about EB itself. David Brooks 01:01, 24 August 2005 (UTC)
Also, in case you haven't seen it, at the top of Wikipedia talk:2004 Encyclopedia topics David Brooks 01:08, 24 August 2005 (UTC)
This is bizarre, someone from EB working on Wikipedia, whatever next! Martin - The non-blue non-moose 08:22, 24 August 2005 (UTC)
Someone from EB vandalising Wikipedia, in fact. See the first few contributions from that IP address. -- 220.239.76.246 10:53, 24 August 2005 (UTC)
Not necessarily the same person. David Brooks
Seems like someone from EB tried to test how soon difficult-to-spot errors were corrected, was very impressed and started contributing. --R.Koot 14:35, 24 August 2005 (UTC)
The Britannica 2004 CD-ROM has some index files, which are in a special format, but as they're not encrypted, with a little patience the format can be hacked. :-) 195.212.29.67 09:37, 24 August 2005 (UTC)
Although doing so would almost certainly be a violation of the terms of use.
(The above comment was from the currently-discussed EB address.) Hypothetically speaking, if one got the CD from a public library, and one looked through the files on the CD without running the software, then one never would have agreed to abide by the terms of use. Just saying. – Quadell (talk) (sleuth) 18:12, August 26, 2005 (UTC)

There is an implied question here: is EB's list of encyclopedia topics copyrightable? The answer is not immediately obvious, even to someone well-versed in copyright law, but I think the preponderance of the case law indicates that such a list is indeed copyrightable. But I don't think these Wikipedia pages violate that copyright. I'll explain.

(First, as a side note, be aware that using this list to fill in holes in Wikipedia is not a violation of anything. Only publishing the list could be. So Wikipedia could hypothetically be required to delete the list itself, but the articles created because of this list would not be in jeopardy.)

In Feist Publications v. Rural Telephone Service it was decided that a mere collection of information was not copyrightable (in this case, a list of all subscribers and their phone numbers), since there was not even a modicum of creativity in assembling the list. The list included every subscriber, and no creative decisions were made in assembling the list. The debate over at Talk:FHM-US's 100 Sexiest Women 2005 seemed to reach the conclusion that FHM's list was uncopyrightable because it was determined from user surveys, and not by FHM's own decisions, and therefore had no creativity in its creation (although one lawyer at Wikipedia disagrees).

The list of articles at EB, however, is chosen by the company. It seems to me like an obviously creative decision. Is the 4th Earl of Lancaster worthy of inclusion? This is not an automatic choice, and, arguably, exceeds the minimum threshold of creativity required for copyright protection. (American Dental v. Delta Dental[7] found that a taxonomic list of dental codes is copyrightable because of the choices made in assembling the taxonomy.)

But this page is not a verbatim copy of EB's list. Not only are the entries often renamed, but many encyclopedia entries are omitted. Any claim of infringement would be vastly weakened by this fact.

However, the creator of this list might have violated copyright law in assembling the list. (Although I doubt it. I think fair use would apply.) With that in mind, if I had assembled the list, I would personally not describe how I assembled it without talking to a lawyer. – Quadell (talk) (sleuth) 19:16, August 26, 2005 (UTC)

Yes, I would agree with all that. But note that this guy from EB is clearly not here officially or anything; he just seems interested, and actually quite friendly. I don't think there is any reason to worry at all. Martin - The non-blue non-moose 19:23, 26 August 2005 (UTC)

Deletion of Encarta and Britannica projects

Could someone explain the reasoning behind the deletion of the Encarta and Britannica projects? The only (partial) explanation I've found is on this web page. Mateo SA | talk 15:53, August 28, 2005 (UTC)

I also feel a more rounded-out explanation would be extremely helpful, but that is the only information that has been made available to the masses so far. Pcb21| Pete 16:08, 28 August 2005 (UTC)
See also User_talk:Jimbo Wales. Pcb21| Pete 16:25, 28 August 2005 (UTC)
Um, Wikipedia:2004 Encyclopedia topics has been deleted? Really? David Brooks 05:30, 29 August 2005 (UTC)
Yes it was. It was restored temporarily for, I think, technical reasons, but it will be re-deleted soon. – Quadell (talk) (sleuth) 11:08, August 29, 2005 (UTC)

I am away at the moment, so I don't have time to comment properly. While this is annoying, if it is deemed a copyvio risk, then I have no problem with the lists being removed, hopefully while we find a way around the problem. In the meantime the public domain lists are absolutely fine (I assume). Good luck guys, I'll join the melee properly next Thursday, thanks. Martin - The non-blue non-moose 10:03, 29 August 2005 (UTC)

Merge

Given the possible copyright violation of the following pages

As well as the deletion of two previous project pages (Encarta and Columbia)

I think that the lists of topics from these pages need to be reorganized or merged in a new fashion. I am willing to do the work of creating the list, but I want to be sure that

  1. My work will not be deleted
  2. I am not duplicating the efforts of someone else
  3. I am doing this in a way that will make work easier for everyone else

I have several proposals.

  1. Lists that have copyright concerns are merged into the General Encyclopedia lists. This would eliminate any copyright concerns because we are creating our own lists of topics that should be covered. This of course would require a restart of the General topics progress. I also think that blue links should not be eliminated from the list. The experience with Encarta and 2004 has shown that oftentimes a "blue link is not a blue link" because the material covered is not the same. This would require lots and lots of trimming, as well as the referencing of multiple encyclopedia sources to make sure that there is complete coverage. Because of the multiple searches (Google, EB, Weisstein, gwp, Columbia? Encarta?), each page would have to be relatively small (~200) so that load times are not excessive. Smaller pages also seem to spur on faster work.
  2. Another "General" list is created from various topics. This would eliminate the concerns of restarting the General topics list, but all the other concerns (trimming and searching) would remain.
  3. Wait. There is still LOTS of work to be done with many other projects without copyright concerns: 1911, Nuttall, Jewish and my pet project, the Catholic, still have lots of redlinks to be made blue.

I personally favor the first proposal. There should only be one list of "General topics" so that a link only has to be trimmed once and not from several pages. Can you imagine a General_Topics_1 X General_topics_2 cross? The problem is that I don't know the "multiple sources" that were used to create the General list, or whether it was a strict redlink list or was somehow vetted before it was released. I wasn't here when the project started, but I know that a lot of work was put into creating and clearing these lists. I don't want to have to do double work, but I can't see any other way around it. Reflex Reaction 15:31, 29 August 2005 (UTC)

I think there would still be copyright problems if we simply merge EB's copyrighted list, Encarta's copyrighted list, and Hutchinson's copyrighted list. To be guaranteed legal, we'd have to create an all-new list that uses those lists as sources but doesn't "copy" any of them directly. This is hard to do, and difficult to prove, but I'm working on it right now. See User:Quadell/topics for what I've done so far. If you have questions about the list, I'd prefer you e-mail me with them. I'd appreciate any help you could offer or suggestions you could give. – Quadell (talk) (sleuth) 17:06, August 29, 2005 (UTC)
(Discussion taken offline) – Quadell (talk) (sleuth) 13:21, August 30, 2005 (UTC)
As well as merging the lists, we could also only ever upload a few hundred at a time. This would hopefully make it even safer, although it would mean one person would have to update it every time it ran out of red links. Martin - The non-blue non-moose 10:48, 31 August 2005 (UTC)

Merging Languages

Hey - I'm a bit new here, so if this has already been done, then I apologise in advance. I read somewhere a note that just because an entry is listed as a "missing entry", it may still exist in a different language. I think it'd be beneficial to make another list of entries, being those that do exist, yet do not exist in English. In fact, a list of all the topics in every language could be generated automatically, and each language could have its own list (also automatically generated) which would display all of the Wikipedia entries that exist only in other languages. This could also be done in another pair of automated lists, where stubs (instead of wholly missing articles) could be listed - that is, list all the non-stub articles that other languages have, that the language in question does not have a non-stub article for. When clicking on items in these lists, you might get another list, which would link to those articles which were found in other languages.

I hope that was somewhat clear - at any rate, I believe it would be a very useful list, as people who speak more than one language could also do translation work. This would open up another field of opportunity for people to contribute, and more ways of contribution are always good - perhaps some people who have not been contributing would begin to do so if this particular method of contribution were available, since perhaps they feel more comfortable in it, or would like to practise translation, etc.

On the same note, there should be a watch-list of translated articles (from one language of Wikipedia to another). I'm not sure if this exists either, but it would be good to have - and if it were displayed somewhere convenient, then that would also be beneficial. For example, "edit this page" is very evident on an article. "Translate this page" should be just as evident, and perhaps even another tab at the top could be added for it.

The Wikipedia project is absolutely fantastic, in my *humble* opinion - the goals are very direct and clear. Among those goals are to provide the free equivalent of every single topic or article that can be found in commercial counterparts. I'm not sure if there's a list of goals, or if this is already one of them - but I believe another goal should not only be to provide an encyclopedia for every language, but also to have every one of those languages as complete as the rest. These automated lists could help with that.
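A rough sketch of how the automated per-language lists proposed above might be computed. It assumes the titles have already been matched across languages under a shared topic key (for instance via interlanguage links); the function name `missing_in_language` and the sample data are hypothetical and only for illustration.

```python
from typing import Dict, List, Set


def missing_in_language(
    topics_by_lang: Dict[str, Set[str]], target: str
) -> Dict[str, List[str]]:
    """Map each topic absent from `target` to the languages that cover it.

    `topics_by_lang` is assumed to hold, per language code, the set of
    topic keys that already have an article in that language.
    """
    have = topics_by_lang.get(target, set())
    missing: Dict[str, List[str]] = {}
    for lang, topics in topics_by_lang.items():
        if lang == target:
            continue
        # Set difference: topics this language has that the target lacks.
        for topic in topics - have:
            missing.setdefault(topic, []).append(lang)
    # Sort topics and language codes so the output is stable.
    return {topic: sorted(langs) for topic, langs in sorted(missing.items())}


# Hypothetical sample data, not real article counts.
sample = {"en": {"A", "B"}, "de": {"A", "C"}, "fr": {"C", "D"}}
report = missing_in_language(sample, "en")
```

Each entry in `report` names a topic and the languages a translator could work from. The stub variant suggested above would use the same set difference, with each set restricted to non-stub articles.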

"List of Encyclopedia Topics" in the merge?

The merged file is just about ready. This new encyclopedic list will contain entries from the following copyrighted encyclopedias:

  • Encarta Encyclopedia
  • Encyclopaedia Britannica 2004
  • Columbia Encyclopedia
  • Hutchinson's Encyclopedia
  • Encyclopedia of Modern Jewish Culture
  • McGraw-Hill's Encyclopedia of Science and Technology

It also contains info from a few other smaller lists as well. This master list will contain around 80,000 entries before duplicates are removed.

There also exists Wikipedia:List of encyclopedia topics, although some have debated its importance in comparison with the above. I don't know what sources were used for creating it. So the question is, should the two general encyclopedic lists be merged? (This would slow things down a bit.) Or should they stay separate? Comments are welcome. – Quadell (talk) (sleuth) 13:36, September 1, 2005 (UTC)


Please vote Support or Oppose for the merging of the proposed General topics list with the existing General Topics list. Voting will end in 5 days. 20:30, 6 September 2005 (UTC).

Support

  1. Support for reasons given above Reflex Reaction 20:24, 1 September 2005 (UTC)
  2. Support Best as a list of missing topics to work on. If you can't figure out what a topic means or where it came from, it's probably a good one to research. +sj + 18:24, 2 September 2005 (UTC)

Oppose

  1. I would prefer to keep them apart, although if it is significantly safer legally to merge them then that is acceptable. Martin - The non-blue non-moose 20:34, 1 September 2005 (UTC)
  2. It's already a long list. I personally think it's best to keep things tightly focused. – Quadell (talk) (sleuth) 22:37, September 1, 2005 (UTC)
  3. The current list is a mess; hopefully this wouldn't be the same. Keep them separate. Ambi 23:02, 1 September 2005 (UTC)

Comments

What about Nuttall/1911?
Or to put it another way: The answer depends on what we are trying to achieve. If we really prefer separate lists but are only merging them because we have to, then we should keep the lists separate. If we genuinely want to create a list of missing topics to work on then we should merge the lot. Pcb21| Pete 13:56, 1 September 2005 (UTC)
Personally I would strongly prefer separate lists. In particular, Nuttall and 1911 are much easier to work with separately because you know where you can find information from. I'm not a fan of the quality of Wikipedia:List of encyclopedia topics (though a lot of the real dross has been stripped out of it by now) and think that merging it in will actually detract from the new list (I might have argued for leaving out Hutchinson for the same reason, but there are few enough of those that it's no big deal).
In another issue, given that he will run the legal risk, and it was his intervention that prompted us to go down this route, can I recommend that we get Jimbo's go-ahead that he's happy with the new list? It would be a bit silly to spend a lot of time formatting and editing the new list only for our benevolent dictator to come along and say "nope, I still think it's too risky" and delete the lot. --OpenToppedBus - Talk to the driver 16:00, September 1, 2005 (UTC)
My comments and suggestions are above, but I think that lists that can stay separate should be separate so that we can more closely know the "intent" of the article. With the source available, I know exactly what material was covered by the encyclopedia; with a compiled list it is much more difficult to know the source of the material, and consequently to know whether it is appropriate to remove it from the list. Reflex Reaction 16:10, 1 September 2005 (UTC)
i.e. the first of my two suggestions. If we go down this route then I agree with OpenToppedBus's suggestion that we run it by the legal folks. Pcb21| Pete 16:16, 1 September 2005 (UTC)
Agreed, one extra safety feature we could offer would be to only upload small portions of the list at a time. Martin - The non-blue non-moose 16:28, 1 September 2005 (UTC)
OK, I've asked Jimbo to have a look over this discussion. --OpenToppedBus - Talk to the driver 16:34, September 1, 2005 (UTC)
Someone might also want to consider sending a mail to juriwiki-l AT wikimedia PUNTO org which is the list for legal issues. The membership and archives are restricted, so we can't see their discussion, but I think anyone can still send mail to that list. Presumably their opinion would be a strong factor in influencing what Jimbo might do in cases he is unsure about. Dragons flight 17:59, September 1, 2005 (UTC)
I don't understand (not that I have to, because I'm only working on the free sources now). I think you are suggesting just making the 6-way merge. If you deliberately discard the source of any given entry, how will you know which source(s) to check and make sure you're describing "the same John Doe"? Do you check all six? David Brooks 18:05, 1 September 2005 (UTC)
The whole concept of "the right one" never made much sense to me. If the EB talks about a King John Doe of England in its King John Doe article, and Encarta talks about a different king in its King John Doe article, then ideally we should have both. This was true when the lists were separate just as much as when they're merged. I'd say checking any encyclopedia should be fine, or just use your head: if the existing article is about Ashlee Judd's uncle or a minor fanfic character, it's probably not the encyclopedic one. – Quadell (talk) (sleuth) 18:21, September 1, 2005 (UTC)
A modest proposal. Rather than simply merge the lists, if we are reluctant to indicate the source of the titles, let's at least indicate, if not split by, the number of sources. A missing item that is in 8 sources is likely to be more important than one that is only in one.
Rich Farmbrough 19:48, 1 September 2005 (UTC)
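The proposal above could be sketched roughly as follows: merge the per-source title lists, collapse duplicates, and keep only a count of how many sources list each title rather than which ones. The source labels, the `normalize` rule (case and whitespace only) and the function names are assumptions for illustration; matching real titles would need more careful handling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def normalize(title: str) -> str:
    """Collapse whitespace and case so trivially different spellings match."""
    return " ".join(title.split()).casefold()


def count_sources(lists_by_source: Dict[str, Iterable[str]]) -> List[Tuple[str, int]]:
    """Return (title, number_of_sources) pairs, most widely listed first."""
    sources_for = defaultdict(set)  # normalized title -> set of source labels
    display = {}                    # normalized title -> first spelling seen
    for source, titles in lists_by_source.items():
        for title in titles:
            key = normalize(title)
            sources_for[key].add(source)
            display.setdefault(key, title.strip())
    # Sort by descending source count, then alphabetically for stable output.
    ranked = sorted(sources_for, key=lambda k: (-len(sources_for[k]), k))
    return [(display[k], len(sources_for[k])) for k in ranked]


# Hypothetical source labels and titles, not taken from any real list.
lists = {
    "S1": ["John Doe", "Foo"],
    "S2": ["john  doe", "Bar"],
    "S3": ["John Doe"],
}
merged = count_sources(lists)
```

Only the count survives in the output, so the merged list does not record where any single title came from, while still letting the list be split or sorted by how widely a topic is covered.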

Archived talk page

This page is getting quite lengthy and finding the new discussions is becoming quite difficult. Could an experienced Wikipedian with some history in the project move some of the older discussions to an archived talk page? Presumably this very suggestion would be moved to that same page. Reflex Reaction 14:45, 7 September 2005 (UTC)