I need a way to convert DOC and DOCX files to XML. Can I use phpdocx to convert DOC or DOCX to XML without any loss?
Hello,
Please note that DOCX documents include XML files and also other optional files such as images, XLSX files and other binary files. A DOCX is a ZIP file, so if you extract it you can view the included files. A DOCX is not a single XML file but a package of many XML parts that follow the OOXML standard (https://en.wikipedia.org/wiki/Office_Open_XML).
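To illustrate the ZIP structure described above, here is a minimal sketch in Python using only the standard `zipfile` module. It is not part of phpdocx; the function names `list_xml_parts` and `read_part` are made up for this example, and `word/document.xml` is assumed to be the main document part, which is the usual location in files saved by Word.

```python
import zipfile


def list_xml_parts(docx_path):
    """Return the names of the XML and relationship parts in a DOCX package."""
    with zipfile.ZipFile(docx_path) as package:
        # Images and other binary parts are skipped; only XML-based parts are kept.
        return [
            name
            for name in package.namelist()
            if name.endswith((".xml", ".rels"))
        ]


def read_part(docx_path, part_name="word/document.xml"):
    """Return the raw bytes of one part (assumed default: the main document part)."""
    with zipfile.ZipFile(docx_path) as package:
        return package.read(part_name)
```

Calling `list_xml_parts("example.docx")` on a hypothetical file would typically show several parts (document body, styles, relationships and so on), which is why there is no single XML file to extract from a DOCX.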
phpdocx includes methods to extract information and XML contents from DOCX documents, for example getDocxPathQueryInfo, getWordStyles and the indexer. You can also transform DOCX to HTML and other document formats.
Regards.
I asked the wrong question. I am dealing with Word documents from the 2003 version (DOC). I want to convert DOC documents to XML without any loss. Is there a way to map DOC directly to XML, instead of converting DOC to DOCX and then extracting the XML?
Hello,
phpdocx doesn't include a direct conversion from DOC to XML. You first need to transform the DOC to DOCX using transformDocument; you can then use phpdocx methods to get XML contents and information from the resulting DOCX.
Regards.