
PHPWord's HTML reader doesn't work with tables?

Published: 2020-12-14 23:43:22 | Category: Resources | Source: compiled from the web
When I use the HTML reader to convert my HTML to docx, the reader cuts off my table.

PHP example:

$reader = IOFactory::createReader('HTML');
$phpWord = $reader->load($this->getReportDir() . '/' . $fileName);
$writer = IOFactory::createWriter($phpWord);
$writer->save($this->getReportDir() . '/' . $fileName);

Table example:

<table>
    <tr>
        <td>№ п/п</td>
        <td>Общие показатели результатов прохождения проверочных листов</td>
        <td>Количество пройденных проверок</td>
        <td>% от общего количества пройденных проверок</td>
    </tr>
</table>

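Solution

PHPWord's current HTML class is very limited. The problem you are facing is a known issue (see https://github.com/PHPOffice/PHPWord/issues/324).

I am working on a project that needs some HTML tables in its document conversion, so I improved the HTML class a little. It is lightly tested; I have only tested DOC conversion.

My version is able to convert the following HTML: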

<table style="width: 50%; border: 6px #0000FF solid;">
    <thead>
        <tr style="background-color: #FF0000; text-align: center; color: #FFFFFF; font-weight: bold; ">
             <th>a</th>
             <th>b</th>
             <th>c</th>
        </tr>
    </thead>
    <tbody>
        <tr><td>1</td><td colspan="2">2</td></tr>
        <tr><td>4</td><td>5</td><td>6</td></tr>
    </tbody>
</table>
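
To make the structure the parser has to handle more concrete, here is a minimal sketch that walks the same table with PHP's DOMDocument and reports how many grid columns each cell occupies. The helper name collectCellSpans() is hypothetical and not part of PHPWord; it only illustrates how a colspan attribute corresponds to a cell's gridSpan.

```php
<?php
// Illustrative only: collectCellSpans() is a hypothetical helper, not PHPWord API.
function collectCellSpans(string $html): array
{
    $dom = new DOMDocument();
    // Wrap the fragment so it is a single-rooted, well-formed XML document.
    $dom->loadXML('<body>' . $html . '</body>');

    $rows = array();
    foreach ($dom->getElementsByTagName('tr') as $tr) {
        $cells = array();
        foreach ($tr->childNodes as $child) {
            // Ignore whitespace text nodes and anything that is not a cell.
            if ($child->nodeType !== XML_ELEMENT_NODE) {
                continue;
            }
            if ($child->nodeName !== 'td' && $child->nodeName !== 'th') {
                continue;
            }
            $colspan = $child->getAttribute('colspan'); // '' when absent
            $cells[] = array(
                'text'     => trim($child->textContent),
                'gridSpan' => $colspan === '' ? 1 : (int) $colspan,
            );
        }
        $rows[] = $cells;
    }

    return $rows;
}

$html = '<table><thead><tr><th>a</th><th>b</th><th>c</th></tr></thead>'
      . '<tbody><tr><td>1</td><td colspan="2">2</td></tr>'
      . '<tr><td>4</td><td>5</td><td>6</td></tr></tbody></table>';

foreach (collectCellSpans($html) as $index => $cells) {
    $parts = array();
    foreach ($cells as $cell) {
        $parts[] = $cell['text'] . ' (span ' . $cell['gridSpan'] . ')';
    }
    echo 'row ' . $index . ': ' . implode(', ', $parts) . PHP_EOL;
}
```

The second row has two cells, and the one with colspan="2" is reported with a span of 2.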

It produces the corresponding table in the generated DOC file.

It uses PHPWord version 0.13:

<?php
/**
 * This file is part of PHPWord - A pure PHP library for reading and writing
 * word processing documents.
 *
 * PHPWord is free software distributed under the terms of the GNU Lesser
 * General Public License version 3 as published by the Free Software Foundation.
 *
 * For the full copyright and license information, please read the LICENSE
 * file that was distributed with this source code. For the full list of
 * contributors, visit https://github.com/PHPOffice/PHPWord/contributors.
 *
 * @link        https://github.com/PHPOffice/PHPWord
 * @copyright   2010-2016 PHPWord contributors
 * @license     http://www.gnu.org/licenses/lgpl.txt LGPL version 3
 */

namespace PhpOffice\PhpWord\Shared;

use PhpOffice\PhpWord\Element\AbstractContainer;
use PhpOffice\PhpWord\Element\Table;
use PhpOffice\PhpWord\Element\Row;

/**
 * Common Html functions
 *
 * @SuppressWarnings(PHPMD.UnusedPrivateMethod) For readWPNode
 */
class Html
{
    //public static $phpWord=null;

    /**
    *  Holds styles from parent elements,
    *  allowing child elements to inherit attributes.
    *  So if you want your table row to have a bold font
    *  you can do:
    *     <tr style="font-weight: bold; ">
    *  instead of
    *     <tr>
    *       <td>
    *           <p style="font-weight: bold;">
    *       ...
    *
    *  Before the children of a DOM element are processed,
    *  the parent DOM element's styles are added to the stack.
    *  The styles for each child element are composed of
    *  its own styles plus the parent styles.
    */
    public static $stylesStack=null;

    /**
     * Add HTML parts.
     *
     * Note: $stylesheet parameter is removed to avoid PHPMD error for unused parameter
     *
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element Where the parts need to be added
     * @param string $html The code to parse
     * @param bool $fullHTML If it's a full HTML, no need to add 'body' tag
     * @return void
     */
    public static function addHtml($element, $html, $fullHTML = false)
    {
        /*
         * @todo parse $stylesheet for default styles.  Should result in an array based on id, class and element,
         * which could be applied when such an element occurs in the parseNode function.
         */

        // Preprocess: remove all line ends, decode HTML entity,
        // fix ampersand and angle brackets and add body tag for HTML fragments
        $html = str_replace(array("\n", "\r"), '', $html);
        // Protect already-escaped markup characters with placeholders
        $html = str_replace(array('&lt;', '&gt;', '&amp;'), array('_lt_', '_gt_', '_amp_'), $html);
        $html = html_entity_decode($html, ENT_QUOTES, 'UTF-8');
        $html = str_replace('&', '&amp;', $html);
        // Restore the placeholders as XML entities
        $html = str_replace(array('_lt_', '_gt_', '_amp_'), array('&lt;', '&gt;', '&amp;'), $html);

        if (false === $fullHTML) {
            $html = '<body>' . $html . '</body>';
        }

        // Load DOM
        $dom = new \DOMDocument();
        $dom->preserveWhiteSpace = true;
        $dom->loadXML($html);
        $node = $dom->getElementsByTagName('body');

        //self::$phpWord = $element->getPhpWord();
        self::$stylesStack = array();

        self::parseNode($node->item(0),$element);
    }

    /**
     * parse Inline style of a node
     *
     * @param \DOMNode $node Node to check on attributes and to compile a style array
     * @param array $styles is supplied,the inline style attributes are added to the already existing style
     * @return array
     */
    protected static function parseInlineStyle($node,$styles = array())
    {
        if (XML_ELEMENT_NODE == $node->nodeType) {
            $stylesStr = $node->getAttribute('style');
            $styles = self::parseStyle($node,$stylesStr,$styles);
        }
        else
        {
            // Just to balance the stack.
            // (make number of pushs = number of pops)
            self::pushStyles(array());
        } 

        return $styles;
    }

    /**
     * Parse a node and add a corresponding element to the parent element.
     *
     * @param \DOMNode $node node to parse
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element object to add an element corresponding with the node
     * @param array $styles Array with all styles
     * @param array $data Array to transport data to a next level in the DOM tree,for example level of listitems
     * @return void
     */
    protected static function parseNode($node,$element,$styles = array(),$data = array())
    {
        // Populate styles array
        $styleTypes = array('font','paragraph','list','table','row','cell'); //@change
        foreach ($styleTypes as $styleType) {
            if (!isset($styles[$styleType])) {
                $styles[$styleType] = array();
            }
        }

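        // Node mapping table
        $nodes = array(
                              // $method        $node   $element    $styles     $data   $argument1      $argument2
            'p'         => array('Paragraph',   $node,  $element,   $styles,    null,   null,           null),
            'h1'        => array('Heading',     null,   $element,   $styles,    null,   'Heading1',     null),
            'h2'        => array('Heading',     null,   $element,   $styles,    null,   'Heading2',     null),
            'h3'        => array('Heading',     null,   $element,   $styles,    null,   'Heading3',     null),
            'h4'        => array('Heading',     null,   $element,   $styles,    null,   'Heading4',     null),
            'h5'        => array('Heading',     null,   $element,   $styles,    null,   'Heading5',     null),
            'h6'        => array('Heading',     null,   $element,   $styles,    null,   'Heading6',     null),
            '#text'     => array('Text',        $node,  $element,   $styles,    null,   null,           null),
            'strong'    => array('Property',    null,   null,       $styles,    null,   'bold',         true),
            'em'        => array('Property',    null,   null,       $styles,    null,   'italic',       true),
            'sup'       => array('Property',    null,   null,       $styles,    null,   'superScript',  true),
            'sub'       => array('Property',    null,   null,       $styles,    null,   'subScript',    true),
            // @change
            //'table'     => array('Table',       $node,  $element,   $styles,    null,   'addTable',     true),
            //'tr'        => array('Table',       $node,  $element,   $styles,    null,   'addRow',       true),
            //'td'        => array('Table',       $node,  $element,   $styles,    null,   'addCell',      true),
            'table'     => array('Table',       $node,  $element,   $styles,    null,   'addTable',     null),
            'tr'        => array('Row',         $node,  $element,   $styles,    null,   'addRow',       null),
            'td'        => array('Cell',        $node,  $element,   $styles,    null,   'addCell',      null),
            'th'        => array('Cell',        $node,  $element,   $styles,    null,   'addCell',      null),
            'ul'        => array('List',        null,   null,       $styles,    $data,  3,              null),
            'ol'        => array('List',        null,   null,       $styles,    $data,  7,              null),
            'li'        => array('ListItem',    $node,  $element,   $styles,    $data,  null,           null),
        );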

        $newElement = null;
        $keys = array('node','element','styles','data','argument1','argument2');

        if (isset($nodes[$node->nodeName])) {
            // Execute method based on node mapping table and return $newElement or null
            // Arguments are passed by reference
            $arguments = array();
            $args = array();
            list($method,$args[0],$args[1],$args[2],$args[3],$args[4],$args[5]) = $nodes[$node->nodeName];
            for ($i = 0; $i <= 5; $i++) {
                if ($args[$i] !== null) {
                    $arguments[$keys[$i]] = &$args[$i];
                }
            }
            $method = "parse{$method}";
            $newElement = call_user_func_array(array('PhpOffice\PhpWord\Shared\Html', $method), $arguments);

            // Retrieve back variables from arguments
            foreach ($keys as $key) {
                if (array_key_exists($key,$arguments)) {
                    $$key = $arguments[$key];
                }
            }
        }
        else
        {
            // Just to balance the stack.
            // Number of pushs = number of pops.
            self::pushStyles(array());
        }

        if ($newElement === null) {
            $newElement = $element;
        }

        self::parseChildNodes($node, $newElement, $styles, $data);

        // After the parent element has been processed,
        // its styles are removed from the stack.
        self::popStyles();
    }

    /**
     * Parse child nodes.
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array $styles
     * @param array $data
     * @return void
     */
    private static function parseChildNodes($node, $element, $styles, $data)
    {
        if ('li' != $node->nodeName) {
            $cNodes = $node->childNodes;
            if ($cNodes->length > 0) {
                foreach ($cNodes as $cNode) {
                    if (($element instanceof AbstractContainer) or ($element instanceof Table) or ($element instanceof Row)) { // @change
                        self::parseNode($cNode, $element, $styles, $data);
                    }
                }
            }
        }
    }

    /**
     * Parse paragraph node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @return \PhpOffice\PhpWord\Element\TextRun
     */
    private static function parseParagraph($node, $element, &$styles)
    {
        $elementStyles = self::parseInlineStyle($node,$styles['paragraph']);

        $newElement = $element->addTextRun($elementStyles);

        return $newElement;
    }

    /**
     * Parse heading node
     *
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @param string $argument1 Name of heading style
     * @return \PhpOffice\PhpWord\Element\TextRun
     *
     * @todo Think of a clever way of defining header styles, now it is only based on the assumption, that
     * Heading1 - Heading6 are already defined somewhere
     */
    private static function parseHeading($element, &$styles, $argument1)
    {
        $elementStyles = $argument1;

        // Keep the styles stack balanced: parseNode() pops once per node.
        self::pushStyles(array());

        $newElement = $element->addTextRun($elementStyles);

        return $newElement;
    }

    /**
     * Parse text node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @return null
     */
    private static function parseText($node, $element, &$styles)
    {
        $styles['font'] = self::parseInlineStyle($node, $styles['font']);

        $textStyles = self::getInheritedTextStyles();
        $paragraphStyles = self::getInheritedParagraphStyles();

        // Commented as source of bug #257. `method_exists` doesn't seems to work properly in this case.
        // @todo Find better error checking for this one
        // if (method_exists($element,'addText')) {
            $element->addText($node->nodeValue,$textStyles,$paragraphStyles);
        // }

        return null;
    }

    /**
     * Parse property node
     *
     * @param array &$styles
     * @param string $argument1 Style name
     * @param string $argument2 Style value
     * @return null
     */
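    private static function parseProperty(&$styles, $argument1, $argument2)
    {
        $styles['font'][$argument1] = $argument2;

        // Push the property so that child text nodes can inherit it through
        // the styles stack, and so the pop in parseNode() stays balanced.
        self::pushStyles(array($argument1 => $argument2));

        return null;
    }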

    /**
     * Parse table node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @param string $argument1 Method name
     * @return \PhpOffice\PhpWord\Element\Table
     *
     * @todo As soon as TableItem, RowItem and CellItem support relative width and height
     */
    private static function parseTable($node, $element, &$styles, $argument1)
    {
        $elementStyles = self::parseInlineStyle($node,$styles['table']);

        $newElement = $element->addTable($elementStyles);

        // $attributes = $node->attributes;
        // if ($attributes->getNamedItem('width') !== null) {
            // $newElement->setWidth($attributes->getNamedItem('width')->value);
        // }

        // if ($attributes->getNamedItem('height') !== null) {
            // $newElement->setHeight($attributes->getNamedItem('height')->value);
        // }
        // if ($attributes->getNamedItem('width') !== null) {
            // $newElement=$element->addCell($width=$attributes->getNamedItem('width')->value);
        // }

        return $newElement;
    }

    private static function parseRow($node, $element, &$styles, $argument1)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['row']);

        $newElement = $element->addRow(null,$elementStyles);

        return $newElement;
    }


    private static function parseCell($node, $element, &$styles, $argument1)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['cell']);

        // The colspan HTML attribute maps to PHPWord's gridSpan cell style.
        $colspan = $node->getAttribute('colspan');
        if (!empty($colspan))
            $elementStyles['gridSpan'] = (int) $colspan;

        $newElement = $element->addCell(null,$elementStyles);
        return $newElement;
    }

    /**
     * Parse list node
     *
     * @param array &$styles
     * @param array &$data
     * @param string $argument1 List type
     * @return null
     */
    private static function parseList(&$styles, &$data, $argument1)
    {
        if (isset($data['listdepth'])) {
            $data['listdepth']++;
        } else {
            $data['listdepth'] = 0;
        }
        $styles['list']['listType'] = $argument1;

        // Keep the styles stack balanced: parseNode() pops once per node.
        self::pushStyles(array());

        return null;
    }

    /**
     * Parse list item node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @param array $data
     * @return null
     *
     * @todo This function is almost the same like `parseChildNodes`. Merged?
     * @todo As soon as ListItem inherits from AbstractContainer or TextRun delete parsing part of childNodes
     */
    private static function parseListItem($node, $element, &$styles, $data)
    {
        // Keep the styles stack balanced: parseNode() pops once per node.
        self::pushStyles(array());

        $cNodes = $node->childNodes;
        if ($cNodes->length > 0) {
            $text = '';
            foreach ($cNodes as $cNode) {
                if ($cNode->nodeName == '#text') {
                    $text = $cNode->nodeValue;
                }
            }
            $element->addListItem($text, $data['listdepth'], $styles['font'], $styles['list'], $styles['paragraph']);
        }

        return null;
    }

    /**
     * Parse style
     *
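     * @param \DOMNode $node
     * @param string $stylesStr Value of the node's "style" attribute
     * @param array $styles
     * @return array
     */
    private static function parseStyle($node, $stylesStr, $styles)
    {
        // Parses element styles.
        $newStyles = array();

        if (!empty($stylesStr))
        {
            $properties = explode(';', trim($stylesStr, " \t\n\r\0\x0B;"));
            foreach ($properties as $property) {
                // Skip declarations that are not in "key: value" form.
                if (strpos($property, ':') === false) {
                    continue;
                }
                list($cKey, $cValue) = explode(':', $property, 2);
                $cValue = trim($cValue);
                switch (trim($cKey)) {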
                    case 'text-decoration':
                        switch ($cValue) {
                            case 'underline':
                                $newStyles['underline'] = 'single';
                                break;
                            case 'line-through':
                                $newStyles['strikethrough'] = true;
                                break;
                        }
                        break;                
                    case 'text-align':
                        $newStyles['alignment'] = $cValue; // todo: any mapping?
                        break;
                    case 'color':
                        $newStyles['color'] = trim($cValue,"#");
                        break;
                    case 'background-color':
                        $newStyles['bgColor'] = trim($cValue,"#");
                        break;

                    // @change
                    case 'colspan':
                        $newStyles['gridSpan'] = (int) $cValue;
                        break;
                    case 'font-weight':
                        if ($cValue=='bold')
                            $newStyles['bold'] = true;
                        break;
                    case 'width':
                        $newStyles = self::parseWidth($newStyles, $cValue);
                        break;
                    case 'border-width':
                        // parseBorderWidth() is the method defined below.
                        $newStyles = self::parseBorderWidth($newStyles, $cValue);
                        break;
                    case 'border-color':
                        $newStyles = self::parseBorderColor($newStyles,$cValue);
                        break;
                    case 'border':
                        $newStyles = self::parseBorder($newStyles,$cValue);
                        break;                    
                }
            }
        }

        // Add styles to stack.
        self::pushStyles($newStyles);

        // Inherit parent styles (including itself).
        $inheritedStyles = self::getInheritedStyles($node->nodeName);

        // Override default styles with the inherited ones.
        $styles = array_merge($styles,$inheritedStyles);       

        /* DEBUG
        if ($node->nodeName=='th')
        {
            echo '<pre>';
            print_r(self::$stylesStack);
            print_r($styles);
            //print_r($elementStyles);
            echo '</pre>';
        }
        */

        return $styles;
    }

    /**
    *  Parses the "width" style attribute,adding to styles
    *  array the corresponding PHPWORD attributes.
    */
    public static function parseWidth($styles,$cValue)
    {
        if (preg_match('/([0-9]+)px/',$cValue,$matches))
        {
            $styles['width'] = $matches[1];
            $styles['unit'] = 'dxa';
        }
        else if (preg_match('/([0-9]+)%/', $cValue, $matches))
        {
            // Word's "pct" unit is expressed in fiftieths of a percent.
            $styles['width'] = $matches[1]*50;
            $styles['unit'] = 'pct';
        }
        else if (preg_match('/([0-9]+)/', $cValue, $matches))
        {
            $styles['width'] = $matches[1];
            $styles['unit'] = 'auto';
        }

        $styles['alignment'] = \PhpOffice\PhpWord\SimpleType\JcTable::START;

        return $styles;
    }

    /**
    *  Parses the "border-width" style attribute,adding to styles
    *  array the corresponding PHPWORD attributes.
    */
    public static function parseBorderWidth($styles,$cValue)
    {
        // border-width: 2px;
        if (preg_match('/([0-9]+)px/', $cValue, $matches))
            $styles['borderSize'] = $matches[1];

        return $styles;
    }

    /**
    *  Parses the "border-color" style attribute,adding to styles
    *  array the corresponding PHPWORD attributes.
    */
    public static function parseBorderColor($styles,$cValue)
    {
        // border-color: #FFAACC;
        // Strip the leading "#", as is done for color and background-color.
        $styles['borderColor'] = trim($cValue, "#");

        return $styles;
    }    

    /**
    *  Parses the "border" style attribute,adding to styles
    *  array the corresponding PHPWORD attributes.
    */
    public static function parseBorder($styles,$cValue)
    {
        // border: 6px #0000FF solid;
        if (preg_match('/([0-9]+)px\s+#([a-fA-F0-9]+)\s+solid/', $cValue, $matches))
        {
            $styles['borderSize'] = $matches[1];
            $styles['borderColor'] = $matches[2];
        }

        return $styles;
    }

    /**
    *  Return the inherited styles for text elements,
    *  considering current stack state.
    */
    public static function getInheritedTextStyles()
    {
        return self::getInheritedStyles('#text');
    }

    /**
    *  Return the inherited styles for paragraph elements,
    *  considering current stack state.
    */
    public static function getInheritedParagraphStyles()
    {
        return self::getInheritedStyles('p');
    }

    /**
    *  Return the inherited styles for a given nodeType,
    *  considering current stack state.
    */
    public static function  getInheritedStyles($nodeType)
    {
        $textStyles = array('color', 'bold', 'italic');
        $paragraphStyles = array('color', 'alignment');

        // List of phpword styles relevant for each element type.
        $stylesMapping = array(
            'p'         => $paragraphStyles,
            'h1'        => $textStyles,
            'h2'        => $textStyles,
            'h3'        => $textStyles,
            'h4'        => $textStyles,
            'h5'        => $textStyles,
            'h6'        => $textStyles,
            '#text'     => $textStyles,
            'strong'    => $textStyles,
            'em'        => $textStyles,
            'sup'       => $textStyles,
            'sub'       => $textStyles,
            'table'     => array('width', 'borderSize', 'borderColor', 'unit'),
            'tr'        => array('bgColor', 'alignment'),
            'td'        => array('bgColor', 'alignment'),
            'th'        => array('bgColor', 'alignment'),
            'ul'        => $textStyles,
            'ol'        => $textStyles,
            'li'        => $textStyles,
        );

        $result = array();

        if (isset($stylesMapping[$nodeType]))
        {
            $nodeStyles = $stylesMapping[$nodeType];

            // Loop through the styles stack applying styles in
            // the right order.
            foreach (self::$stylesStack as $styles)
            {
                // Loop through all styles applying only the ones relevant for
                // that node type.
                foreach ($styles as $name => $value)
                {
                    if (in_array($name,$nodeStyles))
                    {
                        $result[$name] = $value;
                    }
                }
            }
        }

        return $result;
    }


    /**
    *  Add the parent styles to stack,allowing
    *  children elements inherit from.
    */
    public static function pushStyles($styles)
    {
        self::$stylesStack[] = $styles;
    }

    /**
    *  Remove parent styles at end of recursion.
    */
    public static function popStyles()
    {
        array_pop(self::$stylesStack);
    }
}
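
The stack-based inheritance used by the class can be illustrated in isolation. The following is a minimal sketch, not the class itself: StyleStack and its methods are hypothetical names, and the pushed arrays stand in for what parseStyle() would produce for a table, a row, a header cell and a text node.

```php
<?php
// Hypothetical stand-alone illustration of the styles stack; not PHPWord API.
final class StyleStack
{
    /** @var array[] One entry per open DOM node, outermost first. */
    private $stack = array();

    public function push(array $styles): void
    {
        $this->stack[] = $styles;
    }

    public function pop(): void
    {
        array_pop($this->stack);
    }

    /**
     * Merge all levels from outermost to innermost, keeping only the
     * style names that are relevant for the requesting node type.
     */
    public function resolve(array $allowed): array
    {
        $result = array();
        foreach ($this->stack as $styles) {
            foreach ($styles as $name => $value) {
                if (in_array($name, $allowed, true)) {
                    $result[$name] = $value; // inner levels override outer ones
                }
            }
        }

        return $result;
    }
}

$stack = new StyleStack();
$stack->push(array('width' => 2500, 'unit' => 'pct'));      // <table>
$stack->push(array('bgColor' => 'FF0000', 'bold' => true)); // <tr>
$stack->push(array());                                      // <th>
$stack->push(array());                                      // text node

// A text node only picks up font-related styles from its ancestors.
$textStyles = $stack->resolve(array('color', 'bold', 'italic'));
print_r($textStyles);

$stack->pop(); // leaving the text node
```

Every push must be matched by exactly one pop, which is why branches that have no styles of their own still push an empty array.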

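With this new structure it is easy to add support for new styles. You only need to edit the parseStyle() method and the $stylesMapping variable in the getInheritedStyles() method. I hope it helps.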
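
As a rough sketch of what such an addition looks like, the stand-alone function below handles a CSS `font-style: italic` declaration in the same switch style as parseStyle(). The function name parseDeclarations() is hypothetical and is not part of PHPWord or of the class above. In the real class you would add the equivalent `case` to parseStyle() and make sure the resulting key ('italic' here) is listed in $stylesMapping for the node types that should inherit it.

```php
<?php
// Hypothetical stand-alone sketch; parseDeclarations() is not PHPWord API.
function parseDeclarations(string $css): array
{
    $styles = array();
    foreach (explode(';', trim($css, " \t\n\r\0\x0B;")) as $declaration) {
        $parts = explode(':', $declaration, 2);
        if (count($parts) !== 2) {
            continue; // skip empty or malformed declarations
        }
        $key = strtolower(trim($parts[0]));
        $value = trim($parts[1]);

        switch ($key) {
            case 'font-weight':
                if ($value === 'bold') {
                    $styles['bold'] = true;
                }
                break;
            case 'font-style': // the newly supported property
                if ($value === 'italic') {
                    $styles['italic'] = true;
                }
                break;
            case 'color':
                $styles['color'] = ltrim($value, '#');
                break;
        }
    }

    return $styles;
}

print_r(parseDeclarations('font-style: italic; color: #336699;'));
```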

Usage example:

<?php
include_once 'Sample_Header.php';

// New Word Document
echo date('H:i:s'),' Create new PhpWord object',EOL;
$phpWord = new \PhpOffice\PhpWord\PhpWord();

$section = $phpWord->addSection();
$html  = '<table style="width: 50%; border: 6px #0000FF solid;">'.
            '<thead>'.
                '<tr style="background-color: #FF0000; text-align: center; color: #FFFFFF; font-weight: bold; ">'.
                    '<th>a</th>'.
                    '<th>b</th>'.
                    '<th>c</th>'.
                '</tr>'.
            '</thead>'.
            '<tbody>'.
                '<tr><td>1</td><td colspan="2">2</td></tr>'.
                '<tr><td>4</td><td>5</td><td>6</td></tr>'.
            '</tbody>'.
         '</table>';


\PhpOffice\PhpWord\Shared\Html::addHtml($section, $html);

// Save file
echo write($phpWord,basename(__FILE__,'.php'),$writers);
if (!CLI) {
    include_once 'Sample_Footer.php';
}
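
The `width: 50%` in the table style above goes through parseWidth(), which multiplies the percentage by 50. WordprocessingML expresses the "pct" width unit in fiftieths of a percent, so 100% corresponds to 5000. A minimal sketch of that conversion follows; the helper percentToPct() is a hypothetical name used only for illustration.

```php
<?php
// Hypothetical helper mirroring the percentage branch of parseWidth().
function percentToPct(string $cssWidth): ?int
{
    if (preg_match('/^\s*([0-9]+)%\s*$/', $cssWidth, $matches)) {
        // Fiftieths of a percent: 1% => 50, 100% => 5000.
        return (int) $matches[1] * 50;
    }

    return null; // not a percentage value
}

var_dump(percentToPct('50%'));
var_dump(percentToPct('120px'));
```

For the 50% table in the example this yields 2500, which is the value stored in the table's width style together with the unit 'pct'.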

(Editor: 李大同)
