Help:WordToWiki

From Wikipedia, the free encyclopedia

Stand-alone tools

Here are some tools that may be helpful in converting Microsoft Word files to wiki markup:

Two-stage conversion from Word to MediaWiki

Both of the following methods convert in two stages: Word -> HTML -> MediaWiki markup.

Quick and Dirty

  1. Open your document in Word, and "save as" an HTML file.
  2. Open the HTML file in a text editor and copy the HTML source code to the clipboard.
  3. Paste the HTML source into the large text box labeled "HTML source" on the html2wiki page.
  4. Click the "Convert HTML to wiki markup" button.
  5. Select the text in the "MediaWiki markup" text box and copy it to the clipboard.
  6. Paste the text into the edit box of a Wikipedia article.

Automated scripts

The conversion can also be done using a combination of two scripts and two software packages.

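  1. The following two software packages must be installed: wv (which provides the wvHtml command) and the Perl module HTML::WikiConverter (with its MediaWiki dialect).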
  2. Create the Bash script "doc2mw" and the Perl script "html2mw", both shown below, and make them executable.
  3. Call doc2mw, passing the Word document as its parameter. The MediaWiki markup is written to standard output, e.g.:
> doc2mw my_word.doc
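
Whether the prerequisites from step 1 are in place can be checked from the shell before running a conversion. The following is only a minimal sketch: the helper names require_cmds, require_perl_module and check_doc2mw_deps are hypothetical, and it assumes the MediaWiki dialect is installed as the separate Perl module HTML::WikiConverter::MediaWiki.

```shell
#!/bin/bash
# Hypothetical helpers (not part of the original scripts) that verify the
# prerequisites of doc2mw and html2mw before a conversion is attempted.

# Succeeds only if every named command is found on the PATH.
require_cmds() {
        local cmd status=0
        for cmd in "$@"; do
                if ! command -v "$cmd" >/dev/null 2>&1; then
                        echo "missing command: $cmd" >&2
                        status=1
                fi
        done
        return "$status"
}

# Succeeds only if perl can load the named module.
require_perl_module() {
        if ! perl -M"$1" -e 1 >/dev/null 2>&1; then
                echo "missing Perl module: $1" >&2
                return 1
        fi
}

# Checks everything the two scripts rely on.
check_doc2mw_deps() {
        local status=0
        require_cmds wvHtml perl || status=1
        require_perl_module HTML::WikiConverter || status=1
        # Assumption: the MediaWiki dialect is installed as its own module
        require_perl_module HTML::WikiConverter::MediaWiki || status=1
        return "$status"
}
```

Calling check_doc2mw_deps prints one line per missing prerequisite to standard error and returns a non-zero status if anything is missing.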


doc2mw: a Bash script that takes a single parameter (the Word file), calls wvHtml to produce HTML, and then passes the result to html2mw.

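 #!/bin/bash
 #       doc2mw - Word to MediaWiki converter
 #       Usage: doc2mw my_word.doc
 
 if [ $# -ne 1 ]; then
         echo "Usage: $0 file.doc" >&2
         exit 1
 fi
 
 FILE="$1"
 # Temporary file name; basename drops any directory part of the input path
 TMP="$$-$(basename "${FILE}")"
 
 # Prefer an html2mw in the current directory, otherwise rely on the PATH
 if [ -x "./html2mw" ]; then
         HTML2MW='./html2mw'
 else
         HTML2MW='html2mw'
 fi
 
 # Convert the Word document to HTML in /tmp; stop if wvHtml fails
 wvHtml --targetdir=/tmp "${FILE}" "${TMP}" || exit 1
 # but see also AbiWord: http://www.abisource.com/help/en-US/howto/howtoexporthtml.html
 
 # Remove extra divs (both opening and closing tags)
 perl -pi -e 's{</?div\b[^>]*>}{}gi' "/tmp/${TMP}"
 
 # Print the MediaWiki markup on standard output, then clean up
 "${HTML2MW}" "/tmp/${TMP}"
 rm -f "/tmp/${TMP}"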
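
Because doc2mw takes a single file, converting several documents needs a small wrapper. The following is a minimal sketch and not part of the original scripts: the function name convert_all, the DOC2MW override variable and the .wiki output extension are all assumptions made for illustration. It changes into the target directory and passes bare file names, since doc2mw builds its temporary file name from the parameter.

```shell
#!/bin/bash
# convert_all DIR: run doc2mw on every .doc file in DIR (default: current
# directory), writing the markup for NAME.doc to NAME.wiki in the same directory.
# DOC2MW is a hypothetical override for the converter command.
convert_all() {
        local dir="${1:-.}"
        (
                cd "$dir" || exit 1
                for doc in *.doc; do
                        # With no matches the glob stays literal; skip it
                        [ -e "$doc" ] || continue
                        "${DOC2MW:-doc2mw}" "$doc" > "${doc%.doc}.wiki" ||
                                echo "failed: $doc" >&2
                done
        )
}
```

For example, `convert_all ~/docs` would leave a .wiki file next to each .doc file in that directory. A failed conversion is reported on standard error and may leave an empty or partial .wiki file behind.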

html2mw: a Perl script called by doc2mw, which uses HTML::WikiConverter to convert HTML to MediaWiki markup.

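 #!/usr/bin/perl
 #       html2mw - HTML to MediaWiki converter
 
 use strict;
 use warnings;
 use HTML::WikiConverter;
 
 # Read all input (files named on the command line, or standard input)
 my $html = join '', <>;
 
 my $wc = HTML::WikiConverter->new( dialect => 'MediaWiki' );
 
 my $wiki = $wc->html2wiki($html);
 
 # Substitutions to get rid of nasty things we don't need
 $wiki =~ s{<br />}{}g;     # leftover line-break tags
 $wiki =~ s/&nbsp;//g;      # non-breaking space entities
 print $wiki;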

Disclaimer: These scripts are one possible way to do the conversion, and probably not the best one. Please feel free to improve them.

OpenOffice 2.3

OpenOffice version 2.3 can open Word documents and export them directly to MediaWiki format. This is an excellent alternative, since OpenOffice is a free, open-source replacement for Microsoft Word.

  1. Open the Word document in OpenOffice 2.3.
  2. Go to File / Export.
  3. Under File format choose MediaWiki (.txt).
  4. Click Save.
  5. Open the new file in a text editor and copy the contents to the clipboard.
  6. Paste the text into the edit box of a Wikipedia article.