Placeholders are split across xml nodes
Topic closed:
Please note this is an old forum thread. Information in this post may be out of date and/or erroneous.
Every phpdocx version includes new features and improvements. Previously unsupported features may have been added to newer releases, or past issues may have been corrected.
We encourage you to download the current phpdocx version and check the Documentation available.

Posted by bkdl  · 31-08-2021 - 21:17

Hello,

Our Word templates are user-generated, which means the formatting can sometimes be strange and unpredictable. We have a case where simply toggling bold on part of a placeholder splits the placeholder in the XML.

processTemplate doesn't seem to be able to repair these placeholders. In the xml below I selected a few characters from the placeholder text and toggled Bold on and off. You can see that this split the placeholder into a few runs (w:r).

What should we do in these scenarios? Are there any settings we can use to get phpdocx to recognize these as a placeholder and repair it? I suspect the issue is that the run properties (w:rPr) are not the same across each run, but if their styles are evaluated they would be a match.

 


<w:document mc:Ignorable="w14 wp14">
   <w:body>
      <w:p>
         <w:pPr>
            <w:pStyle w:val="Normal"/>
            <w:bidi w:val="0"/>
            <w:jc w:val="left"/>
            <w:rPr/>
         </w:pPr>
         <w:r>
            <w:rPr>
               <w:b w:val="false"/>
               <w:bCs w:val="false"/>
            </w:rPr>
            <w:t>{{USER</w:t>
         </w:r>
         <w:r>
            <w:rPr/>
            <w:t>_N</w:t>
         </w:r>
         <w:r>
            <w:rPr>
               <w:b w:val="false"/>
               <w:bCs w:val="false"/>
            </w:rPr>
            <w:t>A</w:t>
         </w:r>
         <w:r>
            <w:rPr/>
            <w:t>ME}}</w:t>
         </w:r>
      </w:p>
      <w:p>
         <w:pPr>
            <w:pStyle w:val="Normal"/>
            <w:bidi w:val="0"/>
            <w:jc w:val="left"/>
            <w:rPr/>
         </w:pPr>
         <w:r>
            <w:rPr/>
         </w:r>
      </w:p>
      <w:sectPr>
         <w:type w:val="nextPage"/>
         <w:pgSz w:w="12240" w:h="15840"/>
         <w:pgMar w:left="1134" w:right="1134" w:header="0" w:top="1134" w:footer="0" w:bottom="1134" w:gutter="0"/>
         <w:pgNumType w:fmt="decimal"/>
         <w:formProt w:val="false"/>
         <w:textDirection w:val="lrTb"/>
         <w:docGrid w:type="default" w:linePitch="100" w:charSpace="0"/>
      </w:sectPr>
   </w:body>
</w:document>
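
To make the split concrete, here is a minimal Python sketch (not phpdocx code) that parses a trimmed copy of the first paragraph above and lists the text of each run. The `xmlns:w` declaration and the `run_texts` helper are additions for illustration only, since the pasted XML omits its namespace declarations.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace. The forum snippet omits the xmlns
# declaration, so it is added here to make the fragment parseable.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Trimmed copy of the first paragraph from the post (pPr removed).
PARAGRAPH = (
    '<w:p xmlns:w="' + W_NS + '">'
    '<w:r><w:rPr><w:b w:val="false"/><w:bCs w:val="false"/></w:rPr>'
    '<w:t>{{USER</w:t></w:r>'
    '<w:r><w:rPr/><w:t>_N</w:t></w:r>'
    '<w:r><w:rPr><w:b w:val="false"/><w:bCs w:val="false"/></w:rPr>'
    '<w:t>A</w:t></w:r>'
    '<w:r><w:rPr/><w:t>ME}}</w:t></w:r>'
    '</w:p>'
)


def run_texts(paragraph_xml):
    """Return the text of every w:t element, in document order."""
    root = ET.fromstring(paragraph_xml)
    return [t.text or "" for t in root.iter("{" + W_NS + "}t")]


texts = run_texts(PARAGRAPH)
print(texts)           # ['{{USER', '_N', 'A', 'ME}}']
print("".join(texts))  # {{USER_NAME}}
```

The placeholder only exists as a whole once the run texts are joined; in the raw XML, its four pieces are separated by run and run-property markup.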

 

Posted by bkdl  · 01-09-2021 - 14:29

Thanks for the quick reply.

I should have mentioned that we have customized those fields already according to the docs. All variable replacement works, except when it is split across nodes as shown previously. Is there anything I'm missing with this regex? (We're using `#` as our block identifier.)

CreateDocxFromTemplate::$regExprVariableSymbols = '\{\{(?:#){0,1}(?:[A-Z0-9\s\-_])+\}\}';

Is it perhaps that we're allowing a space character in our variables?

Posted by bkdl  · 01-09-2021 - 15:29

I changed our regex to the following and it's working well now...

'\{\{[^}]+\}\}'

That's still too permissive for us, but you put me on the right track to find what my previous regex was missing. Maybe newlines? I don't know; I'll keep exploring.

 

Thanks!

Posted by bkdl  · 01-09-2021 - 16:08

Ohhhhhhhhhh. This regex operates on the entire XML as a string for each portion of the document. So in the case I listed we're not matching because the XML tags sitting between the pieces of the placeholder contain colons, slashes, and other characters that our character class excludes.
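
This can be reproduced outside phpdocx. The sketch below uses Python's `re` module rather than PHP's PCRE, on the assumption that both engines treat these two patterns the same way. The XML strings are trimmed copies of the runs posted earlier, and how phpdocx wraps or flags the pattern internally is not shown here.

```python
import re

# The two patterns from this thread, copied verbatim.
STRICT = r'\{\{(?:#){0,1}(?:[A-Z0-9\s\-_])+\}\}'
PERMISSIVE = r'\{\{[^}]+\}\}'

# Placeholder split across four runs (trimmed from the posted XML).
SPLIT_XML = (
    '<w:r><w:rPr><w:b w:val="false"/></w:rPr><w:t>{{USER</w:t></w:r>'
    '<w:r><w:rPr/><w:t>_N</w:t></w:r>'
    '<w:r><w:rPr><w:b w:val="false"/></w:rPr><w:t>A</w:t></w:r>'
    '<w:r><w:rPr/><w:t>ME}}</w:t></w:r>'
)

# The same placeholder held in a single run.
UNSPLIT_XML = '<w:r><w:rPr/><w:t>{{USER_NAME}}</w:t></w:r>'


def matches(pattern, xml):
    """True if the pattern is found anywhere in the raw XML string."""
    return re.search(pattern, xml) is not None


print(matches(STRICT, UNSPLIT_XML))    # True
print(matches(STRICT, SPLIT_XML))      # False: '<' follows '{{USER'
print(matches(PERMISSIVE, SPLIT_XML))  # True: the tags contain no '}'
```

The strict pattern stops at the first `<` after `{{USER`, because `<`, `/`, `:`, `"`, `=` and `>` are all outside `[A-Z0-9\s\-_]`. The permissive pattern accepts anything except `}`, so it spans the intervening markup.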

Considering this behavior I don't think we have much of an option to be specific in our variable name requirements within the delimiters, but that's probably ok.
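
If stricter naming rules still matter, one possible workaround is to check templates in your own code before handing them to phpdocx: match permissively on the raw XML, strip the tags from each match, then validate what remains. The following Python sketch illustrates that idea only. `find_placeholders` is a hypothetical helper, not part of phpdocx, and the tag-stripping regex assumes no attribute value contains `>`.

```python
import re

PERMISSIVE = re.compile(r'\{\{[^}]+\}\}')            # regex from this thread
TAG = re.compile(r'<[^>]+>')                         # naive XML tag matcher
STRICT_NAME = re.compile(r'\{\{#?[A-Z0-9\s\-_]+\}\}')  # original naming rule


def find_placeholders(raw_xml):
    """Return (placeholder, is_valid) pairs found in a raw XML string.

    Hypothetical pre-check helper: each permissive match has its tags
    removed, then the remaining text is tested against the strict rule.
    """
    results = []
    for match in PERMISSIVE.finditer(raw_xml):
        name = TAG.sub('', match.group(0))
        results.append((name, STRICT_NAME.fullmatch(name) is not None))
    return results


SPLIT_XML = (
    '<w:r><w:rPr><w:b w:val="false"/></w:rPr><w:t>{{USER</w:t></w:r>'
    '<w:r><w:rPr/><w:t>_N</w:t></w:r>'
    '<w:r><w:rPr><w:b w:val="false"/></w:rPr><w:t>A</w:t></w:r>'
    '<w:r><w:rPr/><w:t>ME}}</w:t></w:r>'
)
BAD_XML = '<w:r><w:rPr/><w:t>{{user.name}}</w:t></w:r>'

print(find_placeholders(SPLIT_XML))  # [('{{USER_NAME}}', True)]
print(find_placeholders(BAD_XML))    # [('{{user.name}}', False)]
```

This keeps the permissive pattern for phpdocx itself, while still letting you flag template variables that break your own naming convention.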

Thanks again!