How to use the streammode?
Topic closed:
Please note this is an old forum thread. Information in this post may be out of date and/or erroneous.
Every phpdocx version includes new features and improvements. Previously unsupported features may have been added to newer releases, or past issues may have been corrected.
We encourage you to download the current phpdocx version and check the Documentation available.

Posted by pwctechnicalsecurity  · 01-04-2021 - 20:11

 

Simply put, I am following the example outlined in phpdocx/examples/DocxUtilities/mergeDocx/sample_2.php.

Roughly:

CreateDocx::$returnDocxStructure = true;
$docx = new CreateDocxFromTemplate($manifest->canvas->word_file_path);
Simple operations (generally adding sequences through EmbedHTML). The DocX is pushed into an array.
Above two steps are done two more times.
// Compose Word document from queue
$first = $queue->shift();

$merge = new MultiMerge();

CreateDocx::$streamMode = true;

return $merge->mergeDocx($first, $queue->toArray(), 'docs/example.docx', array('mergeType' => 0));

If I set CreateDocx::$streamMode = true; Word reports the following issues when opening the document. If I don't set streamMode, there are no issues. Any suggestions?

Word found unreadable content in "docs_example (2).docx". Do you want to recover the contents of this document? If you trust the source of this document, click Yes.

When clicking Yes, I get another dialog box that shows the repairs:

Word detected and repaired the following errors. To view each repair, select it in the list, then click Go To. Save the document to make the repairs permanent.

A listbox that shows "Styles1"

If I want to view the repair, by clicking Go To, it states: This bookmark does not exist.

Posted by pwctechnicalsecurity  · 01-04-2021 - 22:24

This is really annoying, so I tried the following:

public function debug(Request $request)
{
    \Phpdocx\Create\CreateDocx::$returnDocxStructure = true;
    $queue = array();
    $docx = new \Phpdocx\Create\CreateDocxFromTemplate('/path_to_docx');
    $queue[] = $docx->createDocx();
    \Phpdocx\Create\CreateDocx::$returnDocxStructure = false;
    \Phpdocx\Create\CreateDocx::$streamMode = true;
    $merge = new \Phpdocx\Utilities\MultiMerge();
    $merge->mergeDocx($queue[0], array(), 'example.docx', array());
}

This still gives me the above-mentioned issue. I don't think it gets simpler than this.

If I alter sample_2.php a bit so that it calls CreateDocxFromTemplate twice with our template and sets streamMode = true after Phpdocx\Create\CreateDocx::$returnDocxStructure = false; it works fine.

Could it be related to using phpdocx (namespaces version) in combination with Laravel?

Posted by pwctechnicalsecurity  · 02-04-2021 - 08:11

public function debug(Request $request)
{
    \Phpdocx\Create\CreateDocx::$returnDocxStructure = true;
    $queue = array();
    $docx = new \Phpdocx\Create\CreateDocxFromTemplate('/path_to/vendor/phpdocx/examples/files/headings.docx');
    $queue[] = $docx->createDocx();
    \Phpdocx\Create\CreateDocx::$returnDocxStructure = false;
    \Phpdocx\Create\CreateDocx::$streamMode = true;
    $merge = new \Phpdocx\Utilities\MultiMerge();
    $merge->mergeDocx($queue[0], array(), 'example.docx', array());
}

Using your own DOCX as the template still triggers the first issue (the unreadable content recovery prompt), but not the second "Show Repairs" dialog box (the "Styles 1" entry).

Posted by admin  · 02-04-2021 - 09:24

Hello,

We have run the following script (the same one you posted):

<?php

require_once '../../Classes/Phpdocx/Create/CreateDocx.php';

Phpdocx\Create\CreateDocx::$returnDocxStructure = true;
$queue = array();
$docx = new Phpdocx\Create\CreateDocxFromTemplate('../../examples/files/headings.docx');
$queue[] = $docx->createDocx();
Phpdocx\Create\CreateDocx::$returnDocxStructure = false;
Phpdocx\Create\CreateDocx::$streamMode = true;
$merge = new Phpdocx\Utilities\MultiMerge();
$merge->mergeDocx($queue[0], array(), 'example.docx', array());

using PHP CLI mode and redirecting the stream to a file:

$ php test.php > output.docx

and the DOCX opens with all versions of MS Word without issue.

Please run the same code using PHP CLI mode, so you can check whether the problem comes from the integration with Laravel. If you are adding the stream to Laravel, maybe the code you are using is not returning the correct response for a stream (https://laravel.com/docs/8.x/responses#streamed-downloads) and some extra content is being added to the DOCX (https://www.phpdocx.com/documentation/cookbook/corrupted-docx)?

Also note that mergeDocx and createDocx methods return a DOCXStructure that can be reused. Doing:

return $merge->mergeDocx(...

doesn't return a stream but a DOCXStructure. When you set CreateDocx::$streamMode to true, the DOCX is generated as a stream, which you need to handle yourself (not with a return statement).

Regards.

Posted by pwctechnicalsecurity  · 02-04-2021 - 14:44

Hi, thanks for all the suggestions.

Do you have a working example of a Laravel streamed download for a merged file?

Would it be something like this?

return response()->streamDownload(function () use ($merge) {
    echo $merge->mergeDocx($queue[0], array(), 'example.docx', array());
}, 'filenamexxx');

It must have something to do with:

  • The DOCX file is not valid and it is not possible to open it. This is the most common issue about corrupted documents, and it means that the web server or PHP are adding extra content to the file at the beginning or the end, e.g when it is downloaded. It is possible to see those additional contents by opening the DOCX with a hex-editor. This way you can trace their origin and thus prevent them to be added.

Especially since I am basically not adding any content to my DOCX in the example. Besides a working Laravel example, are there any well-known PHP or web server behaviors that generally cause these issues? In other words, are there any examples of the statement above that would help me pinpoint the issue?

Posted by admin  · 02-04-2021 - 15:26

Hello,

Sorry, but we don't have a sample of using the stream mode with Laravel, as the method used to return a stream response may vary between Laravel versions (the same applies to other frameworks/CMS: Symfony, Yii, Drupal, WordPress...).

phpdocx generates a standard stream output when using CreateDocx::$streamMode, so the CMS/framework/custom code can handle it, but the exact internal method/object/class depends on each CMS/framework/code version. We recommend checking the documentation for the Laravel version you are using; for example, it seems that some people use a stream method from the Laravel Response object: https://stackoverflow.com/questions/58989925/laravel-response-stream-download-returns-empty-file. We also recommend opening the DOCX output with a hex viewer so you can check what extra content is being added to the DOCX by the framework you are using, maybe some extra information from the Response object.

For example, on https://www.phpdocx.com/en/forum/default/topic/1463 you can see a very similar case working with Symfony and downloading the DOCX. The code doesn't use the stream mode from phpdocx, but using an incorrect Response with Symfony added extra content to the DOCX output.
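As a rough illustration of the hex-viewer check described above, the following sketch counts the bytes found before the first ZIP signature and after the end of the ZIP archive (a DOCX is a ZIP container). The function inspectDocxBytes() and the synthetic sample bytes are hypothetical and only for illustration; they are not part of phpdocx or Laravel, and the signature search is deliberately simplistic.

```php
<?php
// Sketch: locate stray bytes around a DOCX (ZIP) payload.
// inspectDocxBytes() is a hypothetical helper, not a phpdocx API.

function inspectDocxBytes(string $bytes): array
{
    $localHeader = "PK\x03\x04"; // ZIP local file header signature
    $endRecord = "PK\x05\x06";   // ZIP end-of-central-directory signature

    $start = strpos($bytes, $localHeader);
    $end = strrpos($bytes, $endRecord);
    if ($start === false || $end === false || strlen($bytes) < $end + 22) {
        return ['zipFound' => false, 'leading' => null, 'trailing' => null];
    }

    // The end record is 22 bytes long plus an optional comment whose length
    // is stored as a little-endian 16-bit value at offset 20 of the record.
    $commentLength = unpack('v', substr($bytes, $end + 20, 2))[1];
    $expectedEnd = $end + 22 + $commentLength;

    return [
        'zipFound' => true,
        'leading' => $start,                          // bytes before the archive
        'trailing' => strlen($bytes) - $expectedEnd,  // bytes after the archive
    ];
}

// Synthetic example: a tiny ZIP-like payload wrapped in stray output.
$payload = "PK\x03\x04" . "data" . "PK\x05\x06" . str_repeat("\x00", 18);
$report = inspectDocxBytes("<html>" . $payload . "\n");
```

For a real download, the same function could be given the result of file_get_contents() on the saved file. Nonzero leading or trailing counts would suggest that something outside phpdocx wrote to the response.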

Regards.

Posted by pwctechnicalsecurity  · 02-04-2021 - 16:22

Hi,

Thanks for the suggestion; I think I could manage myself if only the online documentation were more specific (or available in the first place). I think the community would greatly benefit from a simple working example for popular frameworks, especially considering that Laravel uses Symfony (Symfony\Component\HttpFoundation\StreamedResponse) and more and more frameworks are using Symfony components. Providing a publicly accessible Symfony example would probably address a lot of situations already.

Let me phrase it differently (as 90% of the code is probably there already):

\Phpdocx\Create\CreateDocx::$returnDocxStructure = true;
$queue = array();
$docx = new \Phpdocx\Create\CreateDocxFromTemplate('/path_to_.docx');
$queue[] = $docx->createDocx();

$merge = new \Phpdocx\Utilities\MultiMerge();
$result = $merge->mergeDocx($queue[0], array(), 'example.docx', array());

return response()->streamDownload(function () use ($result) {

    echo $result;

}, 'document.docx');

My idea is to have the MultiMerge instance return a DOCXStructure (instead of having the file created on the filesystem when $returnDocxStructure = false) and to take ownership of the streaming myself, without setting streamMode to true, because that actually interferes with the regular Controller / Response behavior.

Hence, while the above example is not correct, it should come very close. Let me phrase it like this: what output do I need to have from PhpDocx (DocX, Docx Structure?) in order to stream the result (without touching the filesystem)?

Posted by admin  · 02-04-2021 - 17:04

Hello,

The documentation (https://www.phpdocx.com/documentation/cookbook/) details how to integrate the library with some frameworks and CMS: Symfony, CakePHP, Drupal, Laravel, Yii, and others.

As for what output you need from phpdocx (DOCX, DOCXStructure?) in order to stream the result, this is done by setting:

CreateDocx::$streamMode = true;

as detailed on the API documentation page (https://www.phpdocx.com/api-documentation/performance/zip-stream-docx-with-PHP), and as shown in the code from a previous reply:

<?php

require_once 'Classes/Phpdocx/Create/CreateDocx.php';

Phpdocx\Create\CreateDocx::$returnDocxStructure = true;
$queue = array();
$docx = new Phpdocx\Create\CreateDocxFromTemplate('../../examples/files/headings.docx');
$queue[] = $docx->createDocx();
Phpdocx\Create\CreateDocx::$returnDocxStructure = false;
Phpdocx\Create\CreateDocx::$streamMode = true;
$merge = new Phpdocx\Utilities\MultiMerge();
$merge->mergeDocx($queue[0], array(), 'example.docx', array());

You can't return (or echo) the mergeDocx output as a DOCX stream by doing:

$output = $merge->mergeDocx($queue[0], array(), 'example.docx', array());
return $output;

because it returns a DOCXStructure object, that is created using an internal phpdocx class (DOCXStructure). A DOCXStructure object can use the saveDocx method to save/stream the content:

// CreateDocx::$streamMode = true can also be used to stream the content instead of saving the document when using a DOCXStructure object

$docxStructure->saveDocx('document');

The stream mode and in-memory DOCX features included in phpdocx and how to use them are explained on the following documentation pages:

https://www.phpdocx.com/documentation/cookbook/improve-phpdocx-performance

https://www.phpdocx.com/documentation/cookbook/in-memory-docx-documents

CreateDocx::$streamMode = true generates a stream, so if you need to use it in another workflow (such as returning the stream as a Response), you may need to capture it and stream the content again. This can be done using PHP stream functions, for example stream_copy_to_stream (https://www.php.net/manual/en/function.stream-copy-to-stream.php). However, this is not part of the phpdocx workflow, which just generates the stream; it belongs to external code that may vary depending on the framework/CMS/custom code. On https://symfonycasts.com/screencast/symfony-uploads/file-streaming you can find a guide to streaming with Symfony 5 (the code uses stream_copy_to_stream to illustrate the sample) that may be helpful for your case.
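One way to picture the "capture it and stream it again" idea is with PHP output buffering rather than stream_copy_to_stream. The sketch below is a generic illustration under stated assumptions: captureOutput() is a hypothetical helper, the closure only stands in for the phpdocx call made with CreateDocx::$streamMode = true, and it assumes the stream is written to PHP's standard output so that a buffer can intercept it. It also holds the whole document in memory, which gives up part of the benefit of streaming.

```php
<?php
// Sketch: capture whatever a callable writes to PHP's output into a string.
// captureOutput() is a hypothetical helper, not a phpdocx or Laravel API.

function captureOutput(callable $producer): string
{
    ob_start();
    try {
        $producer();
    } finally {
        // Always close the buffer, even if the producer throws.
        $captured = ob_get_clean();
    }

    return $captured === false ? '' : $captured;
}

// The closure is a placeholder for the code that streams the DOCX.
$bytes = captureOutput(function () {
    echo "PK\x03\x04"; // stand-in for the streamed DOCX bytes
});
```

The captured string could then be handed to whatever response object the framework expects, together with the appropriate download headers.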

Regards.

Posted by pwctechnicalsecurity  · 03-04-2021 - 08:14

Setting up a stream seems like an inferior option in an already performance intensive operation.

For people looking into this issue, a possible (yet inferior) solution is:

\Phpdocx\Create\CreateDocx::$returnDocxStructure = true;
$queue = array();
$docx = new \Phpdocx\Create\CreateDocxFromTemplate('/path_to.docx');
$queue[] = $docx->createDocx();
\Phpdocx\Create\CreateDocx::$returnDocxStructure = false;
\Phpdocx\Create\CreateDocx::$streamMode = true;
$merge = new \Phpdocx\Utilities\MultiMerge();
$merge->mergeDocx($queue[0], array(), 'example.docx', array());

return response()->noContent();

IMHO this is inferior, because the stream is already started by $merge->mergeDocx($queue[0], array(), 'example.docx', array()). The last return statement is just there to ensure that no additional content is captured in PHPDocx's stream. This does the trick, but seems like a bit of a dirty solution to me. 

Why not output your PhpDocx stream along the lines of Symfony\Component\HttpFoundation\StreamedResponse? As argued before, this Symfony component is becoming a widely accepted component in other frameworks and CMS as well.

In any case, hope this helps some people looking into the same issue. Meanwhile, I am looking forward to the updated documentation. Could you please drop a line here once it becomes available? 

Posted by admin  · 03-04-2021 - 10:57

Hello,

phpdocx doesn't use/include StreamedResponse or other similar dependencies for many reasons: the library is compatible with all PHP versions from PHP 5.2.11; many CMS and frameworks don't use it but rely on their own classes/methods to work with streams (WordPress, CodeIgniter...); the component isn't needed, as phpdocx generates the stream in a generic way that doesn't require it; there's a classic package that doesn't use PHP namespaces; and other reasons.

Regards.