Extra space on the beginning of every paragraph on html to doc
Topic closed:
Please note this is an old forum thread. Information in this post may be out of date and/or erroneous.
Every phpdocx version includes new features and improvements. Previously unsupported features may have been added to newer releases, or past issues may have been corrected.
We encourage you to download the current phpdocx version and check the Documentation available.

Posted by ddg  · 19-11-2018 - 18:21

I have simple templates set up, with some layout set (header, footer, margins) and one single variable to replace. However, when I replace it with HTML with the call:

$docx->replaceVariableByHTML('REPORT', 'block', $css . $html, [
    'strictWordStyles' => false,
    'parseDivsAsPs' => true
]);

… the generated file has a space at the beginning of every single paragraph. I set the paragraphs to have no left margin, and the HTML renders as I want in the browser (with no margin or text indentation), but the generated files invariably have that space. I tried replacing <p> tags with <div> and removing potential tab and newline characters, all for naught. Tidy is enabled; I've checked with

extension_loaded('tidy')

… right before the file generating code, and the function returns true.

PHP 7.2.6, phpdocx 7.5.
Any help fixing this issue would be greatly appreciated, and would lead to acquiring further licenses to deploy this software to other servers. Thank you in advance!

Posted by admin  · 19-11-2018 - 18:52

Hello,

Please send to contact[at]phpdocx.com the username or e-mail of the user that purchased the license.

Also, please attach the most simple template and script that illustrates your issue. We need to run it, so the script shouldn't call external resources (such as databases or web services).

We have only seen those extra blank spaces when Tidy is missing or not enabled, so we need to check your template and script to find the source of your issue.
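A minimal sketch for confirming that Tidy is available and which release is in use, since both matter for this issue. The helper name tidyDiagnostics is hypothetical and not part of phpdocx; extension_loaded() and tidy_get_release() are standard PHP functions.

```php
<?php
// Hypothetical diagnostic helper (not part of phpdocx).
// Reports whether the tidy extension is loaded and, if so, which release
// of the underlying Tidy library PHP was built against.
function tidyDiagnostics(): array
{
    $loaded = extension_loaded('tidy');

    return array(
        'loaded'  => $loaded,
        'release' => $loaded ? tidy_get_release() : null,
    );
}

var_dump(tidyDiagnostics());
```

Running this right before the document-generating code shows what the same PHP process sees, which can differ between the CLI and the web server.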

Regards.

Posted by admin  · 20-11-2018 - 11:25

Thanks for sending the files. The problem is related to the PHP Tidy configuration, which is inserting a line break after each word; we have tested your script on our test servers and the blank spaces don't appear.

phpdocx 8 added a new option to clean these breaks, removeLineBreaks, but you can easily add support for it to phpdocx 7.5.

Please edit the classes/CreateDocx.inc file and, in the embedHTML method, replace the following lines:
 
if ($class == 'WordFragment') {
  $this->wordML .= (string) $sFinalDocX[0];
} else {
  $this->_wordDocumentC .= (string) $sFinalDocX[0];
}

with:

if ($class == 'WordFragment') {
  $this->wordML .= (string) str_replace(array("\n", "\r\n"), '', $sFinalDocX[0]);
} else {
  $this->_wordDocumentC .= (string) str_replace(array("\n", "\r\n"), '', $sFinalDocX[0]);
}

After this change the blank space in each line will disappear.

Regards.

Posted by miklosgeyer  · 02-01-2019 - 10:58

Could you please name the Tidy configuration option that causes a break after each word?

I have the same issue with phpdocx 8.5. When I set removeLineBreaks to true, it removes all spaces in the text.

Posted by admin  · 02-01-2019 - 13:22

Hello,

Those blank spaces may be due to the vertical-space option from Tidy (https://stackoverflow.com/questions/2491657/how-do-i-get-html-tidy-to-not-put-newline-before-closing-tags), but some versions of Tidy may also cause this issue.

The next release of phpdocx will include a new option to clean these breaks that come from some Tidy versions/configurations (line breaks after closing tags). We recommend that you edit the classes/CreateDocx.php file and replace the following lines (in the embedHTML method):

if ($class == 'WordFragment') {
    if (isset($options['removeLineBreaks']) && $options['removeLineBreaks'] == true) {
        $this->wordML .= (string) str_replace(array("\n", "\r\n"), '', $sFinalDocX[0]);
    } else {
        $this->wordML .= (string) $sFinalDocX[0];
    }
} else {
    if (isset($options['removeLineBreaks']) && $options['removeLineBreaks'] == true) {
        $this->_wordDocumentC .= (string) str_replace(array("\n", "\r\n"), '', $sFinalDocX[0]);
    } else {
        $this->_wordDocumentC .= (string) $sFinalDocX[0];
    }
}

with:

if ($class == 'WordFragment') {
    if (isset($options['removeLineBreaks']) && $options['removeLineBreaks'] == true) {
        $this->wordML .= (string) str_replace(array(">\n", ">\r\n"), '>', $sFinalDocX[0]);
    } else {
        $this->wordML .= (string) $sFinalDocX[0];
    }
} else {
    if (isset($options['removeLineBreaks']) && $options['removeLineBreaks'] == true) {
        $this->_wordDocumentC .= (string) str_replace(array(">\n", ">\r\n"), '>', $sFinalDocX[0]);
    } else {
        $this->_wordDocumentC .= (string) $sFinalDocX[0];
    }
}

And set removeLineBreaks to true. This change removes only the line breaks that directly follow a tag and should fix your issue. If you are using the namespaces package rather than the classic one, please keep the namespaces in that code.
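The difference between the two replacements can be sketched on a small sample. The string below is a made-up, simplified stand-in for the generated WordML (an assumption, not real phpdocx output), in which Tidy has wrapped the text and left a line break after each tag.

```php
<?php
// Hypothetical, simplified sample: a line break after each tag and one
// inside the text where Tidy wrapped the words.
$sample = "<w:p>\n<w:r>\n<w:t>Hello\nworld</w:t>\n</w:r>\n</w:p>\n";

// Blanket removal: strips every line break, including the one that
// separates the two words, so they end up joined.
$blanket = str_replace(array("\n", "\r\n"), '', $sample);

// Tag-aware removal: only strips a break that directly follows '>',
// so the break between the words is left in place.
$tagAware = str_replace(array(">\n", ">\r\n"), '>', $sample);

echo $blanket, PHP_EOL;
echo $tagAware, PHP_EOL;
```

If Tidy wraps after each word, the blanket version would join all the words, which is one plausible explanation for the report above that all spaces disappeared.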

Regards.

Posted by queejie  · 23-01-2019 - 20:30

Thank you.  I had the same problem, and replacing the embedHTML() code worked for me as well.  I don't think the tidy option makes a difference.

Posted by admin  · 01-03-2019 - 20:42

Hello,

After running many tests, we have confirmed that some Tidy installations don't respect the wrap option correctly.

The Tidy documentation states that setting wrap to 0 disables all line wrapping (http://tidy.sourceforge.net/docs/quickref.html#wrap), but some versions/setups don't respect it. To solve this issue, which happens only in a few unusual cases, the easiest solutions are removing the wrap option or setting the wrap value to a large number.

The Tidy configuration can be found in the _load_html method in the DOMPDF_lib.php file. This is the default setting:

$tidy = tidy_parse_string($str, array('output-xhtml' => true, 'markup' => false, 'wrap' => 0, 'wrap-asp' => false, 'wrap-jste' => false, 'wrap-php' => false, 'wrap-sections' => false), 'utf8');

where wrap can be removed:

$tidy = tidy_parse_string($str, array('output-xhtml' => true, 'markup' => false, 'wrap-asp' => false, 'wrap-jste' => false, 'wrap-php' => false, 'wrap-sections' => false), 'utf8');

or set with a high value:

$tidy = tidy_parse_string($str, array('output-xhtml' => true, 'markup' => false, 'wrap' => 1000000, 'wrap-asp' => false, 'wrap-jste' => false, 'wrap-php' => false, 'wrap-sections' => false), 'utf8');

Either of these changes solves the extra space issue.
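As a rough sketch, the two variants can be expressed through one small helper. The function name buildTidyConfig is hypothetical and not part of phpdocx; the option names are copied from the default setting shown above, and the call is guarded because the tidy extension may not be loaded.

```php
<?php
// Hypothetical helper (not part of phpdocx): returns the Tidy options from
// the default setting, either without 'wrap' or with a caller-chosen value.
function buildTidyConfig(?int $wrap = null): array
{
    $config = array(
        'output-xhtml'  => true,
        'markup'        => false,
        'wrap-asp'      => false,
        'wrap-jste'     => false,
        'wrap-php'      => false,
        'wrap-sections' => false,
    );

    // Passing null omits 'wrap' entirely; an integer sets it explicitly.
    if ($wrap !== null) {
        $config['wrap'] = $wrap;
    }

    return $config;
}

$str = '<p>Some sample HTML to clean.</p>';

if (extension_loaded('tidy')) {
    // Variant 1: no wrap option at all.
    $tidyNoWrap = tidy_parse_string($str, buildTidyConfig(), 'utf8');

    // Variant 2: a wrap value large enough that no line should reach it.
    $tidyHighWrap = tidy_parse_string($str, buildTidyConfig(1000000), 'utf8');
}
```

Which variant helps depends on the Tidy version and setup, so it is worth trying both.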

phpdocx 9 added new options to set Tidy wrap settings.

Regards.