User:ST47/perlwikipedia/Bugs
From Wikipedia, the free encyclopedia
This page is a makeshift list of bugs in perlwikipedia for those who don't have Google Code accounts or don't want to create them.
New
- Description: get_text does not work when on a non-English wiki (example bug)
- Description: linksearch fails when more than 500 results found
- Summary: When searching for a link that returns more than 500 results, the linksearch method fails with the following error:
Can't call method "decoded_content" without a package or object reference at /usr/lib/perl5/site_perl/5.8/Perlwikipedia.pm line 531.
- Example code:
my @results = $bot->linksearch("en.wikipedia.org/wiki/H");
- Thanks. -- JLaTondre 18:20, 16 January 2008 (UTC)
Open
Closed
- Description: When running on ActivePerl on a Windows machine, the get_text method hangs in an infinite loop.
- Summary: This loop seems to occur because the condition on line 295 is never met, because $res->content contains garbled text. (Looks like an encoding problem.)
- This occurs on my computer running ActivePerl on Windows, with the latest versions of all modules. – Quadell (talk) (random) 16:36, 6 June 2007 (UTC)
- A workaround has been found! Shadow1 suggested I go through Perlwikipedia.pm and change all instances of ->content to ->decoded_content. This fixes it. I'm not sure whether a more seamless solution should be developed before closing this bug, though... – Quadell (talk) (random) 19:55, 6 June 2007 (UTC)
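- Illustration: A minimal sketch of the difference between ->content and ->decoded_content on an HTTP::Response object. The response here is built by hand and is purely hypothetical; it is not the response that caused the hang, and the actual cause of the garbled text on Windows was not identified above.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical response: the two bytes C5 A0 are the UTF-8 encoding of "Š".
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/plain; charset=UTF-8' ],
    "\xC5\xA0",
);

my $raw  = $res->content;          # raw bytes, exactly as received
my $text = $res->decoded_content;  # Content-Encoding undone, charset decoded

printf "content:         length %d\n", length $raw;
printf "decoded_content: length %d\n", length $text;
```

With ->content the caller sees undecoded bytes, so a pattern match against the expected page text can fail; ->decoded_content returns a character string instead.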
- The code at http://perlwikipedia.googlecode.com/svn/trunk/Perlwikipedia.pm has a bug: the _put subroutine declares the variable $res twice. That is easily fixed, and I can do it since I have access to the repository, but I am not sure whether the googlecode version of the code is the most recent one. Oleg Alexandrov (talk) 15:53, 20 June 2007 (UTC)
- I fixed it myself. Oleg Alexandrov (talk) 02:07, 23 June 2007 (UTC)
3. Description: get_text fails on certain UTF-8 characters
- Summary: If you attempt to retrieve the text of a page such as Š, the following error is produced:
Can't escape \x{0160}, try uri_escape_utf8() instead at {path}/perlwikipedia/Perlwikipedia.pm line 64
- Test Case: The following code segment demonstrates the problem.
my @results = $bot->what_links_here("Caron");
for my $result (@results) {
    my $page = $result->{title};
    print "Getting $page\n";
    my $text = $bot->get_text($page);
}
- Resolution: I patched my copy of Perlwikipedia.pm by doing exactly what the error message states. I don't know if this is the best approach, but it works.
$ svn diff
Index: Perlwikipedia.pm
===================================================================
--- Perlwikipedia.pm (revision 88)
+++ Perlwikipedia.pm (working copy)
@@ -7,6 +7,7 @@
 use XML::Simple;
 use Carp;
 use Encode;
+use URI::Escape qw(uri_escape_utf8);
 our $VERSION = '0.90';
@@ -61,7 +62,7 @@
 my $extra = shift;
 my $no_escape = shift || 0;
- $page = uri_escape($page) unless $no_escape;
+ $page = uri_escape_utf8($page) unless $no_escape;
 $page =~ s/\&/%26/g; # escape the ampersand
 my $url =
- Thanks. -- JLaTondre 12:00, 18 July 2007 (UTC)
- I applied the patch. I tested it too. Thanks! The new revision is available at the Google code repository for Perlwikipedia. Oleg Alexandrov (talk) 03:28, 19 July 2007 (UTC)
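- Illustration: A small standalone sketch of the behaviour behind this bug, using only URI::Escape and not Perlwikipedia itself. It assumes the page title reaches the escaping call as a Perl character string containing U+0160.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_escape_utf8);

my $page = "\x{0160}";    # "Š", the page title from the report

# uri_escape() croaks on characters above 0xFF, so trap the error.
my $escaped = eval { uri_escape($page) };
print "uri_escape failed: $@" unless defined $escaped;

# uri_escape_utf8() encodes the string as UTF-8 first, then escapes the bytes.
my $fixed = uri_escape_utf8($page);
print "uri_escape_utf8: $fixed\n";
```

The second call yields %C5%A0, the percent-encoded UTF-8 bytes of the character, which is the form a URL can carry.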
4. Description: get_pages_in_category() does not return images in the category
5. Description: get_history failing on articles with UTF-8 characters in the name
- Summary: For articles with UTF-8 characters in the name, such as Kashō, get_history fails. The query does not retrieve the results because the UTF-8 characters need to be escaped. I added $pagename = uri_escape_utf8($pagename); to the start of get_history and it fixed the problem. The same problem will occur with any other function that uses _get_api. It cannot be fixed by simply escaping $query within _get_api, as that would also escape characters that shouldn't be escaped (e.g. the & in &action).
- The following is the diff for the change I made. Thanks. -- JLaTondre 00:14, 27 August 2007 (UTC)
236a237,238
> $pagename = uri_escape_utf8($pagename);
>
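- Illustration: A rough sketch of why the page name has to be escaped on its own rather than as part of the whole query. The query layout below is hypothetical; the string actually built by get_history and passed to _get_api may differ.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape_utf8);

my $pagename = "Kash\x{014D}";    # "Kashō"

# Hypothetical query layout; only the page name is escaped.
my $good = 'action=query&titles=' . uri_escape_utf8($pagename);

# Escaping the assembled query also escapes the "&" and "=" separators.
my $bad = uri_escape_utf8("action=query&titles=$pagename");

print "$good\n";
print "$bad\n";
```

In the second string the separators come out as %26 and %3D, so the server would no longer see separate parameters.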