Vim, Cygwin, docx2txt, catdoc and antiword: View Microsoft Word Documents as Text

October 10th, 2011

I wanted to quickly compare two versions of a Word document and realized I didn’t know how to do it. A quick Google search revealed that catdoc is available with Cygwin (and I guess elsewhere). It prints the text of a .doc file, so you can convert both versions and diff the results:

> catdoc report1a.doc > report1a.txt
> catdoc report1b.doc > report1b.txt
> diff report1*.txt
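
If you do this often, the three commands above can be wrapped in a small shell function that cleans up after itself. This is only a sketch: the name docdiff and the DOC2TXT override variable are my own inventions, not part of catdoc, and it assumes catdoc, diff and mktemp are on your PATH.

```shell
#!/bin/sh
# docdiff: compare the text of two Word .doc files without leaving
# .txt files behind. The function name and the DOC2TXT variable are
# hypothetical; DOC2TXT defaults to catdoc.
docdiff() {
    if [ "$#" -ne 2 ]; then
        echo "usage: docdiff OLD.doc NEW.doc" >&2
        return 2
    fi
    converter="${DOC2TXT:-catdoc}"

    # Temporary files hold the extracted text of each version.
    old_txt=$(mktemp) || return 2
    new_txt=$(mktemp) || { rm -f "$old_txt"; return 2; }

    "$converter" "$1" > "$old_txt"
    "$converter" "$2" > "$new_txt"

    # diff exits 0 when the texts match and 1 when they differ.
    diff "$old_txt" "$new_txt"
    status=$?

    rm -f "$old_txt" "$new_txt"
    return $status
}
```

> docdiff report1a.doc report1b.doc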

Alternatively, use antiword.
Place the following 4 lines in your .vimrc file:

" use antiword to allow Vim to view the text of a Word document
autocmd BufReadPre *.doc set ro
autocmd BufReadPre *.doc set hlsearch!
autocmd BufReadPost *.doc %!antiword "%"

> vim -d report1a.doc report1b.doc

Remember, neither catdoc nor antiword allows you to update your Word documents; they only extract the text.

The catdoc package also covers PowerPoint and Excel files, via its companion tools catppt and xls2csv.

Just found out that these don’t work with the XML-based docs, e.g. report.docx. For these you need the Perl-based docx2txt.
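
To handle both formats from one command, you could pick the converter by file extension. Again, only a sketch under some assumptions: word2txt, DOC2TXT and DOCX2TXT are names I made up, and the docx2txt.pl command name with a trailing "-" for output to stdout follows the upstream docx2txt usage as I understand it. Your package may install the script under a different name, so check before relying on it.

```shell
#!/bin/sh
# word2txt: print the text of a Word file on stdout, choosing the tool
# by extension. The function name and the DOC2TXT / DOCX2TXT override
# variables are hypothetical.
word2txt() {
    case "$1" in
        *.docx)
            # Assumption: upstream docx2txt.pl writes to stdout when
            # the output file argument is "-".
            "${DOCX2TXT:-docx2txt.pl}" "$1" -
            ;;
        *.doc)
            "${DOC2TXT:-catdoc}" "$1"
            ;;
        *)
            echo "word2txt: unsupported file type: $1" >&2
            return 1
            ;;
    esac
}
```

> word2txt report1a.docx > report1a.txt
> word2txt report1b.docx > report1b.txt
> diff report1*.txt

The match is case-sensitive, so a file named REPORT.DOCX would be rejected as written.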
