Convert .doc to Open editable format

Daniel Bareiro's picture

Hi all!

I'm currently working on my final university project, and I've found that
the report templates provided by the teachers are in Microsoft Word
format. It is sad that they encourage the use of a closed format for
something that should be a contribution to humanity...

When I open these documents with OOo 2.x on Debian, the headers are
displayed incorrectly, or other details of the document layout are
misinterpreted.

I tested several web tools, but I noticed that many of them show the
same mistakes when converting from DOC to PDF or ODT, which makes me
think they may be using the same engine to open the Word document.

Only this one [1] seems to convert correctly to PDF. With that half
solved, I still needed the other half, converting the PDF to ODT, but so
far I have not found anything that meets my expectations. I tested this
[2] extension with OOo 3.2 on Debian, but it does not seem to create a
normal editable text document; instead it produces some kind of picture
with text boxes.

Are there any tools you can recommend for this kind of conversion
(DOC -> ODT), with or without intermediate steps?

Thanks in advance for your replies.

Regards,
Daniel

[1] http://docupub.com/pdfconvert/
[2] http://extensions.services.openoffice.org/project/pdfimport

Convert .doc to Open editable format

Mihira Fernando's picture

On 05/30/2011 02:28 AM, Daniel Bareiro wrote:
> [snip]
> Are there any tools you can recommend for this kind of conversion
> (DOC -> ODT), with or without intermediate steps?
>
Have you tried opening it with Google Docs and saving it as ODT?

Convert .doc to Open editable format

Daniel Bareiro's picture

On Monday, 30 May 2011 02:32:32 +0530,
Mihira Fernando wrote:

>> [snip]
>> Are there any tools you can recommend for this kind of conversion
>> (DOC -> ODT), with or without intermediate steps?

> Have you tried opening it with Google Docs and saving it as ODT?

Yes, we tried uploading the Word document to Google Docs, but the styles
are broken there too.

Thanks for your reply.

Regards,
Daniel

Convert .doc to Open editable format

Daniel Bareiro's picture

Hi Leonardo. I cc'ed the reply to the list.

On Sunday, 29 May 2011 21:17:35 +0000,
Leonardo Ruoso wrote:

> > [snip]
> > Are there any tools you can recommend for this kind of conversion
> > (DOC -> ODT), with or without intermediate steps?

> Why are you using 2.x?

I'm using OOo 2.4.1 because this PC runs Debian Lenny 5.0.8.

> Try LibreOffice 3.x packages. Is your system up to date?

As I mentioned in another post in this thread, I did some tests with OOo
3.2.1 on Debian Squeeze 6.0.1. Is there any advantage to using
LibreOffice 3.x over OpenOffice.org 3.x? Can LibreOffice save a PDF as ODT?

> Neither Microsoft Word nor Open/Libre Office is the best tool for
> academic publishing.
>
> LyX/TeX + BibTeX are just perfect, and you can probably find templates
> for the standard your university adopts.

Yes, I know. If I were working alone on a presentation for college, I
might have chosen this option. But when you have to work with people who
have never used [La]TeX, it is difficult to change their mental model
and their paradigm from WYSIWYG to WYSIWYM. So I prefer to save time by
sparing them something new with a steep learning curve, and to use that
time to develop the software and documents using WYSIWYG.

Thanks for your reply.

Regards,
Daniel

Convert .doc to Open editable format

William Hopkins's picture

On 05/29/11 at 06:15pm, Daniel Bareiro wrote:
> On Monday, 30 May 2011 02:32:32 +0530,
> Mihira Fernando wrote:
>
> >> [snip]
> >> Are there any tools you can recommend for this kind of conversion
> >> (DOC -> ODT), with or without intermediate steps?
>
> > have you tried opening it with google docs and saving as odt ?
>
> Yes, we tried to upload the Word document to Google Docs, but the styles
> are displayed broken too.
>

If the styles are not translating, why not create a new document with the style information and copy/paste the content?
If that won't work, and formatting is crucial, I recommend you export to a different format from Word itself. After all, it is a proprietary format and all other programs which read it are based on reverse-engineering it. Go to the source, and save in something more appropriate. Perhaps use a print-to-PDF option.

But I think my first suggestion is your best bet overall. Then you can author in open formats from the start.

Convert .doc to Open editable format

Mark Grieveson's picture

> Can LibreOffice save a PDF as ODT?

I don't know about LibreOffice. However, there are a couple of
command-line programs that may be worth looking into. One is unoconv,
and the other is wkhtmltopdf.
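
A minimal sketch of how unoconv might be driven from a Python script,
assuming unoconv is installed and on the PATH. The helper names
build_unoconv_command and convert, and the file name template.doc, are
made up for illustration. Note that unoconv works through the installed
OpenOffice.org/LibreOffice, so the layout of the result will probably
match what the office suite itself produces when opening the file.

```python
import shutil
import subprocess
from pathlib import Path


def build_unoconv_command(source, fmt="odt"):
    """Return the argument list for converting `source` with unoconv."""
    # -f selects the output format (for example odt or pdf).
    return ["unoconv", "-f", fmt, str(source)]


def convert(source, fmt="odt"):
    """Run unoconv on `source` and return the expected output path.

    Assumes unoconv's default behaviour of writing the converted file
    next to the input, with the new extension.
    """
    if shutil.which("unoconv") is None:
        raise RuntimeError("unoconv is not installed or not on PATH")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_unoconv_command(source, fmt), check=True)
    return Path(source).with_suffix("." + fmt)
```

Calling convert("template.doc") should then leave a template.odt beside
the original, provided the conversion succeeds.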

Mark

Convert .doc to Open editable format

Camaleón's picture

On Sun, 29 May 2011 17:58:05 -0300, Daniel Bareiro wrote:

> I'm currently working on my final university project, and I've found
> that the report templates provided by the teachers are in Microsoft
> Word format. It is sad that they encourage the use of a closed format
> for something that should be a contribution to humanity...
>
> When I open these documents with OOo 2.x on Debian, the headers are
> displayed incorrectly, or other details of the document layout are
> misinterpreted.

(...)

Find a computer with MS Word on it and save/export the document from
there to another file format. With some DOC files (mostly those with
tables and/or images or complex layouts), I've not found a better way to
get a readable file ;-(

Greetings,