Help:WordToWiki
From Wikipedia, the free encyclopedia
Stand-alone tools
Here are some tools that may be helpful in converting Microsoft Word files to wiki markup:
Two-stage conversion from Word to MediaWiki
Both of the following methods convert in two stages: Word -> HTML -> MediaWiki.
Quick and Dirty
- Open your document in Word, and "save as" an HTML file.
- Open the HTML file in a text editor and copy the HTML source code to the clipboard.
- Paste the HTML source into the large text box labeled "HTML source" on the html2wiki page.
- Click the "Convert HTML to wiki markup" button.
- Select the text in the "MediaWiki markup" text box and copy it to the clipboard.
- Paste the text into a Wikipedia article.
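To give a rough sense of what the HTML-to-wiki-markup step does, here is a toy sketch that handles only bold, italic, second-level headings, and paragraph tags with sed. The function name html_to_wiki_toy is hypothetical, and this is not how the html2wiki page works internally. HTML saved by Word is far messier than this and needs a real converter.

```shell
#!/bin/bash
# Toy illustration only: maps a few simple HTML tags to MediaWiki markup.
# html_to_wiki_toy is a hypothetical helper, not part of any tool named above.
html_to_wiki_toy() {
  sed -E \
    -e "s#</?(b|strong)>#'''#g" \
    -e "s#</?(i|em)>#''#g" \
    -e "s#<h2>#== #g" \
    -e "s#</h2># ==#g" \
    -e "s#</?p>##g"
}

# Reads HTML on stdin, writes wiki markup on stdout.
printf '%s\n' '<h2>Intro</h2>' '<p>Some <b>bold</b> and <i>italic</i> text.</p>' \
  | html_to_wiki_toy
```

For the two sample lines, this prints "== Intro ==" followed by "Some '''bold''' and ''italic'' text."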
Automated scripts
The conversion can also be done using a combination of two scripts and two software packages.
- The following two software packages must be installed:
  - wvHtml, a Word to HTML converter that is part of the "wvWare" word viewing library. (Note: wvHtml is deprecated and the site recommends using AbiWord --to=html instead. AbiWord can be obtained at abisource.com.)
  - HTML::WikiConverter, a Perl module that converts HTML to wiki markup.
- Create the bash script "doc2mw" and the Perl script "html2mw", both shown below.
- Run doc2mw, passing the Word document as its parameter, e.g.:
> doc2mw my_word.doc
doc2mw: a bash script taking a single parameter, which calls wvHtml followed by html2mw.
    #!/bin/bash
    # doc2mw - Word to MediaWiki converter

    FILE="$1"
    # Use only the base name so an input path with directories still works
    TMP="$$-$(basename "${FILE}")"

    # Prefer an html2mw in the current directory, otherwise rely on PATH
    if [ -x "./html2mw" ]; then
        HTML2MW='./html2mw'
    else
        HTML2MW='html2mw'
    fi

    wvHtml --targetdir=/tmp "${FILE}" "${TMP}"
    # but see also AbiWord: http://www.abisource.com/help/en-US/howto/howtoexporthtml.html

    # Remove extra divs
    perl -pi -e "s/\<div[^\>]+.\>//gi;" "/tmp/${TMP}"

    "${HTML2MW}" "/tmp/${TMP}"
    rm "/tmp/${TMP}"
html2mw: a Perl script called by doc2mw, which uses HTML::WikiConverter to convert HTML -> MediaWiki.
    #!/usr/bin/perl
    # html2mw - HTML to MediaWiki converter
    use strict;
    use warnings;
    use HTML::WikiConverter;

    # Read all input (files named on the command line, or stdin)
    my $html = '';
    while (<>) { $html .= $_; }

    my $w = HTML::WikiConverter->new( dialect => 'MediaWiki' );
    my $p = $w->html2wiki($html);

    # Substitutions to get rid of nasty things we don't need
    $p =~ s/<br \/>//g;
    $p =~ s/&nbsp;//g;
    print $p;
Disclaimer: These scripts are probably not the best way to do this, only a possible way to do this. Please feel free to improve them.
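Since wvHtml is noted above as deprecated in favour of AbiWord --to=html, the first stage could perhaps be swapped out along the following lines. This is only a sketch: the function names tmp_html_path and doc2html_abiword are hypothetical, and it assumes an AbiWord build whose command line accepts both --to and --to-name, which should be checked against the installed version.

```shell
#!/bin/bash
# Hypothetical alternative to the wvHtml step in doc2mw.
# Assumption: the installed abiword accepts --to=html and --to-name=FILE.

# Derive a temporary HTML path from the input name, ignoring its directory.
tmp_html_path() {
  local base
  base="$(basename -- "$1")"
  printf '%s/%s-%s.html\n' "${TMPDIR:-/tmp}" "$$" "${base%.*}"
}

# Convert a Word file to HTML and print the path of the result.
doc2html_abiword() {
  local src="$1" out
  out="$(tmp_html_path "$src")"
  if ! command -v abiword >/dev/null 2>&1; then
    echo "abiword not found on PATH" >&2
    return 127
  fi
  abiword --to=html --to-name="$out" "$src" && printf '%s\n' "$out"
}

# Possible use together with the html2mw script above:
#   html="$(doc2html_abiword my_word.doc)" && html2mw "$html" && rm "$html"
```

The div-stripping step from doc2mw is left out here because it targets wvHtml output. AbiWord's HTML may need different cleanup.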
OpenOffice 2.3
OpenOffice version 2.3 can export Word documents directly to MediaWiki format. This is an excellent alternative, since OpenOffice is a free, open-source replacement for Microsoft Word.
- Open the Word document in OpenOffice 2.3.
- Go to File / Export.
- Under File format choose MediaWiki (.txt).
- Click Save.
- Open the new file in a text editor and copy the contents to the clipboard.
- Paste the text into a Wikipedia article.