User talk:Flcelloguy/Tool
From Wikipedia, the free encyclopedia
*sulk* *whine* Why does Kate's tool have to be down? Anyway, you may want to mention that the user needs to copy-n-paste from a specific URL, like this: http://en.wikipedia.org/w/index.php?title=Special:Contributions&target=Flcelloguy&offset=0&limit=5000. Also, for those who might have a favorite editor, I've added a section below for that. --Interiot 00:53, 5 December 2005 (UTC)
Equivalent commands in your favorite editor/operating system
Vim
- hit "p" to paste the clipboard into the editor (if your Vim keeps the system clipboard in a separate register, use "+p instead)
- type ":g!/(hist)/d" and hit enter, to remove all non-history lines
- type Control-G, and the total number of lines should be displayed at the bottom of your screen
MS-DOS
- find /c "(hist)" filename
Unix
- grep -c '(hist)' filename
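For systems without any of the tools above, the same count can be sketched in a few lines of Python. This is only an illustration of the idea behind the commands listed here (count the lines containing the literal text "(hist)"); the function name `count_hist_lines` and the sample lines are made up for the example and are not part of anyone's tool.

```python
def count_hist_lines(lines):
    """Count the lines that contain the literal text "(hist)"."""
    return sum(1 for line in lines if "(hist)" in line)


# Hypothetical sample of pasted text; a real run would read the saved
# contributions page instead, e.g. count_hist_lines(open("filename")).
sample = [
    "00:53, 5 December 2005 (hist) (diff) User talk:Flcelloguy/Tool",
    "some navigation text without the marker",
    "00:01, 6 December 2005 (hist) (diff) m Example page",
]
print(count_hist_lines(sample))
```

Like grep -c, this counts matching lines rather than individual occurrences, so a line containing "(hist)" twice is still counted once.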
Quick and dirty
You can do a count by copying and pasting the contributions list into Microsoft Word, selecting all, and formatting it as a numbered list. Something similar can probably be done in other word processors. the wub "?!" 00:01, 6 December 2005 (UTC)
- True, right now the "tool" is at a crude stage where all it can do is count edits by parsing through them and incrementing a variable. However, I plan to include statistics soon, e.g. a breakdown by namespace, percent of minor edits, percent of edit summaries, etc. This is the basic framework for future versions. Thanks! Flcelloguy (A note?) 01:07, 6 December 2005 (UTC)
- In terms of breaking out the specific statistics, I'm doing that with my tool [1], but it turns out that it's difficult to do in a language-agnostic way... that is, it's hard to differentiate between the edit summaries "Category Talk:", "Kategorie Diskussion:", and "Please: stop reverting!" and automatically realize the latter is in the main namespace. But somebody on IRC mentioned that Kate's tool won't return for a couple more weeks :(, so I guess it's good to have some alternatives. --Interiot 03:17, 6 December 2005 (UTC)
- I'm working on a more sophisticated extension to the tool at User:Titoxd/Flcelloguy's Tool, which will be able to correctly parse page names, namespaces, minor/major edits, edit summaries and recent edits from the HTML of the Special:Contributions page, with no need for cutting and pasting. It's still in its early stages, though. Titoxd(?!? - did you read this?) 20:53, 8 December 2005 (UTC)
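One possible way around the language problem mentioned above is to look only at the text before the first colon and check it against a per-language table of namespace names. The sketch below assumes that approach; the `NAMESPACES` table and the function `namespace_of` are hypothetical names for illustration, the tables are deliberately incomplete, and none of this is taken from any of the tools discussed here.

```python
# Hypothetical per-language namespace tables; only a few entries are shown.
NAMESPACES = {
    "en": {"Talk", "User", "User talk", "Category", "Category talk"},
    "de": {"Diskussion", "Benutzer", "Benutzer Diskussion",
           "Kategorie", "Kategorie Diskussion"},
}


def namespace_of(title, lang="en"):
    """Return the namespace prefix of a page title, or '' for the main namespace."""
    prefix, sep, _rest = title.partition(":")
    # A colon only marks a namespace if the text before it is a known name.
    if sep and prefix in NAMESPACES[lang]:
        return prefix
    return ""


print(namespace_of("Category talk:Cellists"))
print(namespace_of("Kategorie Diskussion:Cellisten", lang="de"))
print(repr(namespace_of("Please: stop reverting!")))
```

The cost of this design is that it needs one table per wiki language, so it is not language-agnostic; it just moves the language knowledge out of the parsing code and into data.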
Python version
I find your code, with its repeated if statements and so on, very redundant; it leaves a lot of room for improvement in terms of size. You shouldn't need a separate variable name for each count. But I couldn't be bothered working with Java data structures, so here is essentially the same thing in Python, using standard input/output:
from sys import stdin
from re import compile as compre

# Matches "(diff)", an optional minor-edit flag "m", then either a
# "Prefix:" up to the first colon or, failing that, the rest of the line.
CONTR_RE = compre(r'\(diff\) (m?) ?([^:]*:|.*)')

namespaces = [
    '', 'Talk', 'User', 'User talk', 'Category', 'Category talk',
    'Image', 'Image talk', 'MediaWiki', 'MediaWiki talk',
    'Template', 'Template talk', 'Wikipedia', 'Wikipedia talk'
]

# counts[(flag, namespace)], where flag is 'm' for minor edits and '' otherwise
counts = {}
for ns in namespaces:
    counts[('', ns)] = 0
    counts[('m', ns)] = 0

# The built-in set replaces the old sets.Set class
ns_set = set(namespaces)

for line in stdin:
    match = CONTR_RE.search(line)
    if not match:
        continue  # skip lines that are not contribution entries
    (minor, ns) = match.groups()
    # Strip the trailing colon; anything unrecognised is the main namespace
    if ns[:-1] in ns_set:
        ns = ns[:-1]
    else:
        ns = ''
    counts[(minor, ns)] += 1

def print_row(title, major, minor, tot=None):
    # Pad the title to 16 columns, then print tab-separated counts
    if tot is None:
        tot = str(major + minor)
    print('%s%s\t%s\t%s\t%s' % (
        title, ' '*(16-len(title)), str(major), str(minor), tot
    ))

print_row('Namespace', 'MAJ', 'MIN', 'TOT')
print_row('---------', '---', '---', '---')

# Running totals across all namespaces
counts[''] = 0
counts['m'] = 0
for ns in namespaces:
    ns_name = ns
    if not ns_name:
        ns_name = "main"
    print_row(ns_name, counts[('', ns)], counts[('m', ns)])
    counts[''] += counts[('', ns)]
    counts['m'] += counts[('m', ns)]

print_row('TOTAL', counts[''], counts['m'])