Wikipedia talk:Persondata/archive2
Template:Pharaoh Infobox
The {{Pharaoh Infobox}} template is including {{Persondata}} at the top of pharaoh articles... Mike Dillon 00:46, 13 January 2007 (UTC)
- yes, we need to resolve this. I asked on the Pharaoh infobox talk page that the Persondata template be removed from the Infobox. We need to follow up on this. --Rajah 22:05, 1 April 2007 (UTC)
Template:Persondata edit request
Could a sysop please add the line <!-- Metadata: see [[Wikipedia:Persondata]] --> to the usage section on {{Persondata}}, right after the <pre> tag? This would make it in line with the example given in the Wikipedia:Persondata#Using the template section of this page, and would make it easier for those who don't know about this system to figure it out. Thanks. Picaroon 04:25, 14 January 2007 (UTC)
- Done. Luna Santin 19:39, 15 January 2007 (UTC)
- Thanks. Picaroon 20:40, 15 January 2007 (UTC)
- Adding the comment to the template doesn't actually do anything as the comment is not viewable either in the article view or the editing view. Perhaps it would be useful to add an actual note into the template that is not an HTML comment. Kaldari 23:18, 24 January 2007 (UTC)
- An HTML comment is the only way to handle it. Persondata is not visible; it's just a textual note within the window as to what it is. Ral315 (talk) 00:35, 25 January 2007 (UTC)
- My point is the HTML comment is only useful if it is outside the template, rather than inside. If it's inside the template, you'll never see it since HTML comments in templates are not displayed in editing mode. Thus the recent edit to the template should be reverted. Kaldari 02:52, 25 January 2007 (UTC)
- Compare it yourself: before after
- It's useful for copy/paste. Editors who are unfamiliar with {{persondata}} know where to have a look at for more information (because HTML comments _are_ visible in edit mode). --32X 05:07, 25 January 2007 (UTC)
- My mistake. I thought the usage notice had been added to the template itself. Kaldari 18:14, 25 January 2007 (UTC)
Hi Kaldari. Could you edit the template to link to Template:Persondata/doc, following the template doc page pattern? Mike Dillon 18:46, 25 January 2007 (UTC)
- Where's the actual benefit of it? The doc page contains less information. --32X 22:22, 25 January 2007 (UTC)
- I'm not sure what you're asking, but the benefit over the current situation is that the doc portion will be editable by anyone while keeping the template itself protected. Mike Dillon 22:53, 25 January 2007 (UTC)
- Ok, that's an argument. But wouldn't it be better to set a redirect to Wikipedia:Persondata since that page is all about the template? (If one knows about the template, the short form for copy/pasting is enough; otherwise the introduction is a "must read".) --32X 23:41, 25 January 2007 (UTC)
Siblings and parents
Can we add siblings and parents as a cat? That way, if the info is removed from the article, it will at least be easily found by those who need it. The info doesn't have to display, but it's a good place to store it. The biography infobox has this information but it displays all answers. This way the info could be hidden and still be available for researchers. Answer at my page please. --Richard Arthur Norton (1958- ) 20:45, 24 January 2007 (UTC)
- This seems like a bad idea; persondata has been standardized for the most part. Ral315 (talk) 23:37, 25 January 2007 (UTC)
hCard microformat
It should be relatively trivial to arrange to have "Persondata" published with hCard microformat mark-up, simply by applying some standard class names to its containing elements. The data could then be extracted by a variety of parsing tools. Please see also Wikipedia:WikiProject Microformats. Andy Mabbett 20:59, 28 January 2007 (UTC)
Project proposal to link to WorldCat Identities
Anyone interested in a proposed project to link to WorldCat Identities is invited to leave comments or sign up at the project proposal page. WorldCat Identities provides pages for 20 million 'identities' (authors and persons who are the subjects of published titles in WorldCat). Several thousand of these pages provide links to Wikipedia biographical pages: providing links in the other direction would allow readers of Wikipedia biographical articles to move straight to associated library information held in WorldCat libraries. Dsp13 15:17, 20 February 2007 (UTC)
Template:Birth date and age
For the "Date of Birth" parameter, should we use {{birth date and age}} or should we stay clear of this? --WillMak050389 01:10, 5 March 2007 (UTC)
- I would avoid it. Any application using Persondata is likely to be working with the wiki-text directly, which means it will see {{birth date and age|1967|07|15}} rather than July 15, 1967 (age 39). The idea with Persondata is to make it easier for automatic extraction of data; either of these is yet another format your parser has to handle. In any case the age is more useful to human readers; given the birthdate any program can easily calculate the age. Dr pda 01:39, 5 March 2007 (UTC)
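To illustrate the point about parsers and age calculation, here is a minimal sketch in Python. It assumes the template call has exactly the three-parameter form quoted above; the function names and the regular expression are hypothetical and are not taken from any existing Persondata tool.

```python
import re
from datetime import date

# Hypothetical pattern for the three-parameter call quoted in the discussion.
# Real articles may pass extra parameters, which this sketch does not handle.
BIRTH_TEMPLATE = re.compile(
    r"\{\{\s*birth date and age\s*\|\s*(\d{4})\s*\|\s*(\d{1,2})\s*\|\s*(\d{1,2})\s*\}\}",
    re.IGNORECASE,
)


def parse_birth_template(wikitext):
    """Return the birth date found in a {{birth date and age}} call, or None."""
    match = BIRTH_TEMPLATE.search(wikitext)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def age_on(birth, today):
    """Whole years elapsed between birth and today."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


birth = parse_birth_template("{{birth date and age|1967|07|15}}")
print(birth, age_on(birth, date(2007, 3, 5)))
```

A plain date string in the Persondata field would make the first function unnecessary, which is the argument being made: the age can always be derived, but every extra input format is one more case for a parser to handle.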
Half-automatic tagging with persondata-tool
I come from the German Wikipedia. As of January 24th 2007, 126,332 of 133,221 persons (94.8%) are tagged with persondata. A very useful utility is the persondata-tagging-tool from Apper. It automatically extracts the birthdate, birthplace etc. from the article, and the only thing the user has to do is check that it's correct and then save it. If someone from your project asks him, maybe he will help you with his tools so you can tag your articles much more easily and quickly. Bones 77.180.105.11 22:57, 15 March 2007 (UTC)
- I'm actually almost finished writing a script to do a similar thing, although it requires the article to have an Infobox from which the data is then extracted, rather than getting the data from the lead of the article. However there are still around 50 000 articles using one of the top 20 or so people-infoboxes (e.g. {{Infobox Football biography}}, {{Infobox musical artist}}), which is about ten times the current number of articles with persondata.
- It is more difficult to extract the information from the text of the article (i.e. without an infobox) compared to the de wiki, since on the en wiki the birth/death places are typically not given in a predictable place, i.e. the opening sentence. Compare the first sentences of de:Alfred Hitchcock and Alfred Hitchcock:
- Sir Alfred Joseph Hitchcock KBE (* 13. August 1899 in Leytonstone; † 29. April 1980 in Los Angeles) war ein Filmregisseur und Filmproduzent britischer Herkunft.
- Sir Alfred Joseph Hitchcock, KBE (August 13, 1899 – April 29, 1980) was a highly influential film director and producer who pioneered many techniques in the suspense and thriller genres.
- Hopefully I will have time this weekend to get the script finished. Dr pda 01:27, 16 March 2007 (UTC)
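The difference between the two lead sentences can be sketched in Python. This is only an illustration, assuming the de-wiki lead always has the shape "(* date in place; † date in place)"; the pattern and function name are hypothetical and are not taken from Apper's tool or from the script discussed here.

```python
import re

# Assumed shape of a de-wiki lead: "(* <date> in <place>; † <date> in <place>)".
# The death part is optional so that living persons would still match.
LEAD_PATTERN = re.compile(
    r"\(\*\s*(?P<born>[^;)]+?)\s+in\s+(?P<birthplace>[^;)]+?)"
    r"(?:;\s*†\s*(?P<died>[^;)]+?)\s+in\s+(?P<deathplace>[^;)]+?))?\)"
)


def parse_german_lead(sentence):
    """Return birth/death dates and places from a de-wiki style lead, or None."""
    match = LEAD_PATTERN.search(sentence)
    return match.groupdict() if match else None


german = (
    "Sir Alfred Joseph Hitchcock KBE (* 13. August 1899 in Leytonstone; "
    "† 29. April 1980 in Los Angeles) war ein Filmregisseur und Filmproduzent "
    "britischer Herkunft."
)
english = (
    "Sir Alfred Joseph Hitchcock, KBE (August 13, 1899 – April 29, 1980) "
    "was a highly influential film director and producer."
)

print(parse_german_lead(german))
print(parse_german_lead(english))  # the en-wiki lead carries no places at all
```

The English sentence yields nothing because the places are simply not in the parenthesis, which is why an infobox is the more dependable source on the en wiki.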
- OK, I've finished the script now. Instructions for use are at User talk:Dr pda/persondata.js. It also includes a tidied-up version of the javascript above for turning persondata on/off without editing your monobook.css. Sample results of using the script are here.
- This is a very nice tool - thanks! However, at present it seems to insert the persondata at the top of the article, rather than before categories. No, sorry, it puts everything in the right place! Dsp13 12:17, 19 March 2007 (UTC)
- Or rather, it puts the persondata in almost (but not quite) the right place whenever there is a defaultsort template introducing the categories - see my query below. Dsp13 21:51, 19 March 2007 (UTC)
- By the way I've also got the extraction from the XML dump more-or-less working by modifying the scripts linked at WP:PDATA#Extraction from the XML dump (the last step is deciding whether to write code to parse the dates which are currently giving errors, or just change the data in the article). I don't have an appropriate place to put the scripts on the web, but if anyone wants a copy email me. Dr pda 01:50, 19 March 2007 (UTC)
- User:SEWilco has left this plea, which sounds reasonable, on my talk-page: 'Please do not have your script call itself "this script". That makes reading and searching edit summaries much more difficult.' Could a simple alteration to the script be made? Dsp13 09:14, 1 April 2007 (UTC)
- I've changed the edit summary; it now reads adding persondata using User:Dr pda/persondata.js. I'm not entirely convinced the previous edit summary was difficult to read (compare 'reverted vandalism using popups', 'renaming category per CFD with AWB' etc); anyone interested in knowing which script would click the link, anyone not interested would just be able to see it was done with a script. As for causing difficulty in searching through edit summaries, there should only be one instance of it in an article's history. Users of the script will need to refresh their monobook.js to pick up the change. Dr pda 12:24, 1 April 2007 (UTC)
- thanks! agree with you, but nice to keep everyone we can happy! Dsp13 13:06, 2 April 2007 (UTC)
Query re positioning of persondata before categories
Where categories are immediately preceded by a Template:DEFAULTSORT, should the persondata go between the defaultsort template and the categories (which seems the strict reading of 'immediately before categories', but confusingly splits the defaultsort template from the categories it is concerned with) or immediately before the defaultsort template (which seems more natural to me, but should be specified if that is what is to be recommended)? Dsp13 12:33, 19 March 2007 (UTC)
- In my opinion {{DEFAULTSORT}} is not a real but a meta-template which directly belongs to categories. I don't see the problem here, but to avoid any confusion I've added a comment. --32X 21:19, 19 March 2007 (UTC)
- Thanks. I've modified the script to place the persondata before the {{DEFAULTSORT}} template if it exists. You may need to refresh your monobook.js to pick up the changes. Dr pda 23:04, 19 March 2007 (UTC)
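The placement rule agreed above can be sketched as follows. This is a Python illustration, not the actual persondata.js code (which is JavaScript and is not reproduced on this page); the function name and the fallback of appending at the end when no categories exist are assumptions.

```python
import re

# Matches the first {{DEFAULTSORT...}} call or [[Category:...]] link,
# whichever appears earlier in the wikitext.
ANCHOR = re.compile(r"\{\{\s*DEFAULTSORT\b|\[\[\s*Category\s*:", re.IGNORECASE)


def insert_persondata(wikitext, persondata):
    """Insert persondata before DEFAULTSORT if present, else before the first category."""
    match = ANCHOR.search(wikitext)
    if match is None:
        # No DEFAULTSORT and no categories: append at the end (assumption of this sketch).
        return wikitext.rstrip("\n") + "\n\n" + persondata + "\n"
    pos = match.start()
    return wikitext[:pos] + persondata + "\n" + wikitext[pos:]


article = "Text.\n\n{{DEFAULTSORT:Hugo, Victor}}\n[[Category:French novelists]]\n"
block = "{{Persondata\n|NAME=Hugo, Victor\n}}"
print(insert_persondata(article, block))
```

Treating DEFAULTSORT and the categories as one unit keeps the sort key next to the categories it applies to, which is the reading 32X describes.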
If you see someone removing persondata templates...
...you can now tell them not to do it again, by putting {{subst:pdataremove-warn}} on their user talk page. They will also be pointed here for more information on persondata. Resurgent insurgent 03:33, 25 March 2007 (UTC)
Why is persondata separate to infobox?
Further to my above comment about hCard, please can someone explain to me the purpose and advantage of having persondata in a separate, hidden-by-default table instead of having the same, standard fields in the output of the various infobox templates? What tools exist to parse persondata, inside or outside Wikipedia? Andy Mabbett 00:59, 26 March 2007 (UTC)
- The {{persondata}} isn't a real information box but metadata. It was introduced for the first DVD of the German Wikipedia. The data field is easily accessible with direct SQL (once you have downloaded an image) and therefore allows search operations. With a large article base (more than 100,000 in de.WP) it allows you to run SQL queries, for example to search for articles about birth places which haven't been written yet. Some time ago I read about several tools, but because I didn't feel the need I haven't used them. --32X 18:50, 28 March 2007 (UTC)
- Thank you for the explanation. The use-case makes sense, but it seems to me that this could be achieved just as easily by using hCard, and hCard-like classes, in infoboxes, instead of repeating the information separately; and that that would have additional advantages for readers and editors, through greater interoperability with other tools and websites and ease of authoring. It would also facilitate persondata-like metadata for organisations and venues, through their infoboxes. I'm happy to advise further, if anyone's interested in pursuing this possibility. Andy Mabbett 19:16, 28 March 2007 (UTC)
- To clarify issues in my own mind, I've drawn up a comparison of persondata and hCard properties, on the microformats wiki. Andy Mabbett 19:49, 28 March 2007 (UTC)
- A good reason is that someone using Persondata usually has read this page and knows what they're doing. It is far more common for people to mess up and misuse infoboxes, which would garble the metadata. Circeus 19:03, 28 March 2007 (UTC)
- Like any bad edit, surely that can be remedied? Andy Mabbett 19:16, 28 March 2007 (UTC)
The issue of persondata vs infoboxes has been raised several times on this talk page, see #Use inside implementations of other templates, #Not picked up by Google?, #Hidden Metadata, #Revisiting Infobox Person and #Why is this seperate from Infobox Person?. Some of the main arguments given against combining them are
- This would require every biography to have an infobox, which many editors are opposed to.
- There are a large number of different infoboxes (approx 160), not all of which have all the fields of persondata, and which currently vary greatly in the names for the fields they do have.
- Persondata takes names in the format surname, firstname in order to be able to create an alphabetical list by surname.
There are examples at WP:PDATA#Extraction of persondata of how to extract persondata from an SQL database, or scripts to extract and parse it from the WP XML dump and insert it into a mySQL database, on which you can then run all kinds of queries (these scripts are written for the de wiki but I have more or less adapted them to the en wiki following the hints there, see my comments above).
I notice that your comparison of infobox/persondata/hCard at the microformat wiki is expressed in terms of the rendered (X)HTML of the page; both the previous methods for extracting persondata work with the raw wiki markup, i.e. DATE OF BIRTH = 22 May 1977 rather than <abbr class="dday" title="1977-05-22">22 May 1977</abbr>. Using hCard would then seem to imply a lot of HTML-scraping to get the data, rather than using the periodic database dumps. (there are over 200,000 biographies, though admittedly only a quarter or so have infoboxes and only about 7000 currently have persondata.) Looking at the list of hCard implementations here it seems that most of these implementations deal with recognising hCards on an individual webpage/converting to vCards/adding to address books etc, rather than dealing with large collections of hCards (which would be the end goal of an equivalent to persondata), although I suppose some of the PHP tools could also be used to populate a database. I also notice that hCard does not yet support the date of death and place of birth/death fields, which would seem to argue against its immediate implementation in place of persondata. Perhaps the best way of combining persondata with hCard (if you want to go there at all) would be, as you originally suggested, adding extra class tags in the persondata template itself. Dr pda 15:10, 31 March 2007 (UTC)
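As a rough picture of what working with the raw wiki markup means, here is a minimal Python sketch that pulls the fields out of a {{Persondata}} call. It is not one of the scripts linked from WP:PDATA; the function name and the sample name "Doe, Jane" are hypothetical, and the sketch assumes field values contain no nested templates or piped links.

```python
import re


def extract_persondata(wikitext):
    """Return the fields of the first {{Persondata}} call as a dict, or None.

    Assumption: values contain no "|" or "}}" (no nested templates or piped
    links); a real extractor would need a proper template parser.
    """
    match = re.search(
        r"\{\{\s*Persondata\s*(.*?)\}\}", wikitext, re.IGNORECASE | re.DOTALL
    )
    if match is None:
        return None
    fields = {}
    for part in match.group(1).split("|"):
        if "=" not in part:
            continue  # skips the leading HTML comment and empty pieces
        key, value = part.split("=", 1)
        fields[key.strip().upper()] = value.strip()
    return fields


sample = """Article text.

{{Persondata
<!-- Metadata: see [[Wikipedia:Persondata]] -->
|NAME=Doe, Jane
|DATE OF BIRTH = 22 May 1977
}}
[[Category:Living people]]
"""

print(extract_persondata(sample))
```

Run over every article in a dump, the resulting dictionaries could be loaded into a database table and queried, which is the use case described above.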
Thank you for your detailed response. I appreciate that this must be old ground for some people, but I trust that you will agree that consideration of microformats makes it worth revisiting. I'll address your points as bullets, for the sake of convenience and clarity:
- "This would require every biography to have an infobox, which many editors are opposed to" - I would question why they're opposed, and whether they're perhaps putting personal (aesthetic?) preferences before the convenience of users. That said, perhaps, one day, it might be possible for user preferences to include a "do not display infoboxes" option, like the current "do not show TOCs" option.
- "There are a large number of different infoboxes (approx 160), not all of which have all the fields of persondata, and which currently vary greatly in the names for the fields they do have" - I think there's a case for some standardisation here; perhaps a root "persondata" template, to be included in other biographical infobox templates, in the same way that "coor" is included in a number of other location-related infoboxes.
- "Persondata takes names in the format surname, firstname" - It's possible for software to convert from one format to the other, or for the data entry to be in two (or more) fields (there's experience of doing this for the name field in hCard).
- It should be possible for XML to be dumped from infoboxes/hCards if required.
- "it seems that most of these implementations deal with recognising hCards on an individual webpage" - most, but not all, and these are just the "early adoptions"; there's - deliberately - plenty of scope for other use cases.
- "I also notice that hCard does not yet support the date of death and place of birth/death fields" - yes, but the comparison page you cite suggests a work-around for that.
- "adding extra class tags in the persondata template itself" - hCards (indeed, all microformats) are intended for data that is visible on the page, not for hidden metadata.
Finally, being naturally lazy, I believe strongly in both not reinventing the wheel, and not doing work (i.e. entering data) twice.
- Cheers, Andy Mabbett 19:33, 31 March 2007 (UTC)
P.S. Even while I was typing the above, The Anome was adding, on the Microformats Project talk page:
This a bootstrapping effort at the moment, and you won't see any extra utility in the very short term: but once there's a substantial amount of semantically-tagged content on Wikipedia, some very interesting things will start to happen...
- Andy Mabbett 19:39, 31 March 2007 (UTC)
Persondata box & succession box display
In the case of Victor Hugo, the displayed persondata box gets mixed up together with an immediately preceding succession box. Anyone know why, or how to fix it? Dsp13 12:12, 31 March 2007 (UTC)
- There was a missing {{end box}} template after the succession box. It's fixed now. Dr pda 15:10, 31 March 2007 (UTC)
Gregorian/Julian calendar shift
How best to handle old-style dates? At the moment with Samuel Johnson I've left a template for old-style dates in his birth year, but (per discussion of dates above) I'd rather leave something more transparent in the wikitext. Dsp13 12:54, 31 March 2007 (UTC)
- I think the way you have handled it is best for now. --Rajah 05:30, 2 May 2007 (UTC)
Transcluded persondata?
Ramesses II has persondata somehow 'transcluded' onto the page. I'm not quite sure how this works, or if it's desirable. Any thoughts? Dsp13 21:28, 1 April 2007 (UTC)
- It is because the Pharaoh Infobox contains the persondata template. Wikipedia_talk:Persondata#Template:Pharaoh_Infobox --Rajah 23:25, 1 April 2007 (UTC)
Automatically adding Persondata from German Articles
Wouldn't it be possible (and infinitely faster) if we had a bot/script that could just translate the German persondatas into English? The German articles are already mapped to the English ones, the fields are the same (converting dates should be a breeze) and the only hard parts would be locations/descriptions/names. Does this sound like a good idea? --Rajah 06:42, 4 April 2007 (UTC)
- Yes. No point in duplicating effort, and only the name and description fields really need translation, although putting it all together (following interwikis, extracting persondata, converting, and inserting) does sound kind of troublesome. --Gwern (contribs) 15:26 4 April 2007 (GMT)
- Yeah, that's a great idea. Sounds like a challenging bot to write though. Kaldari 15:30, 4 April 2007 (UTC)
- Sounds like a great idea! Should be mentioned on Wikipedia:Bot requests. MahangaTalk 22:58, 14 April 2007 (UTC)
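The mechanical part of such a bot (renaming fields and converting dates) might look like the Python sketch below. It is only an illustration: the German field names in FIELD_MAP are assumed to be those of the de-wiki Personendaten template, the function names are hypothetical, and the name and description fields are copied unchanged because, as noted above, they would still need human translation.

```python
import re

# Assumed mapping from de-wiki Personendaten fields to en-wiki Persondata fields.
FIELD_MAP = {
    "NAME": "NAME",
    "ALTERNATIVNAMEN": "ALTERNATIVE NAMES",
    "KURZBESCHREIBUNG": "SHORT DESCRIPTION",
    "GEBURTSDATUM": "DATE OF BIRTH",
    "GEBURTSORT": "PLACE OF BIRTH",
    "STERBEDATUM": "DATE OF DEATH",
    "STERBEORT": "PLACE OF DEATH",
}
DATE_FIELDS = {"GEBURTSDATUM", "STERBEDATUM"}
MONTHS = {
    "Januar": "January", "Februar": "February", "März": "March",
    "April": "April", "Mai": "May", "Juni": "June", "Juli": "July",
    "August": "August", "September": "September", "Oktober": "October",
    "November": "November", "Dezember": "December",
}
GERMAN_DATE = re.compile(r"^(\d{1,2})\.\s*([A-Za-zä]+)\s+(\d{1,4})$")


def convert_date(value):
    """Turn '13. August 1899' into 'August 13, 1899'; leave anything else untouched."""
    match = GERMAN_DATE.match(value.strip())
    if match is None or match.group(2) not in MONTHS:
        return value  # unusual formats are left for a human to check
    day, month, year = match.groups()
    return f"{MONTHS[month]} {int(day)}, {year}"


def translate_fields(german):
    """Rename known fields and convert dates; free-text values are copied as-is."""
    english = {}
    for key, value in german.items():
        target = FIELD_MAP.get(key)
        if target is None:
            continue
        english[target] = convert_date(value) if key in DATE_FIELDS else value
    return english


print(translate_fields({
    "NAME": "Hitchcock, Alfred",
    "GEBURTSDATUM": "13. August 1899",
    "GEBURTSORT": "Leytonstone",
}))
```

Following interwiki links, fetching the German wikitext and saving the result are the troublesome parts mentioned above and are deliberately left out.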
Other metadata information
Are there any other metadata templates? Is there any project to make more relevant metadata templates, or does the microformats project pretty much take up this area? Remember 16:13, 7 April 2007 (UTC)
- No, sadly, an organized metadata effort does not yet exist on Wikipedia as far as I know. The microformats, persondata, geodata, etc. movements are all balkanized at present. (Not that that is a bad thing necessarily.) I'm actually in favor of articles having a separate metadata page a la talk pages, but that's just my two cents. --Rajah 05:26, 2 May 2007 (UTC)
- You may want to look at Extension:Semantic_MediaWiki depending on your level of interest. --Rajah 22:18, 9 May 2007 (UTC)