Handling CDATA in PHP (a detailed walkthrough)

Front-end technology · 2023/09/08 · PHP
At the time, I found a CDATA converter online and modified it so that it filters out the CDATA tags. The code is as follows:

// Hypothetical wrapper name; the original shows only the function body.
function cdata_to_text($xml)
{
    // States:
    //
    //     'out'
    //     '<'
    //     '<!'
    //     '<!['
    //     '<![C'
    //     '<![CD'
    //     '<![CDA'
    //     '<![CDAT'
    //     '<![CDATA'
    //     'in'
    //     ']'
    //     ']]'
    //
    // (Yes, the states are represented by strings.)
    //
    $state = 'out';
    $a = str_split($xml);
    $new_xml = '';
    $cdata = '';
    foreach ($a as $v) {
        // Deal with "state".
        switch ($state) {
            case 'out':
                // Outside CDATA: copy characters until a possible opener starts.
                if ('<' == $v) {
                    $state = $v;
                } else {
                    $new_xml .= $v;
                }
                break;
            case '<':
                // Partial opener: either extend it or flush it as ordinary text.
                if ('!' == $v) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = 'out';
                }
                break;
            case '<!':
                if ('[' == $v) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = 'out';
                }
                break;
            case '<![':
                if ('C' == $v) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = 'out';
                }
                break;
            case '<![C':
                if ('D' == $v) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = 'out';
                }
                break;
            case '<![CD':
                if ('A' == $v) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = 'out';
                }
                break;
            case '<![CDA':
                if ('T' == $v) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = 'out';
                }
                break;
            case '<![CDAT':
                if ('A' == $v) {
                    $state = $state . $v;
                } else {
                    $new_xml .= $state . $v;
                    $state = 'out';
                }
                break;
            case '<![CDATA':
                if ('[' == $v) {
                    // Full opener seen: start collecting the CDATA payload.
                    $cdata = '';
                    $state = 'in';
                } else {
                    $new_xml .= $state . $v;
                    $state = 'out';
                }
                break;
            case 'in':
                if (']' == $v) {
                    $state = $v;
                } else {
                    $cdata .= $v;
                }
                break;
            case ']':
                if (']' == $v) {
                    $state = $state . $v;
                } else {
                    $cdata .= $state . $v;
                    $state = 'in';
                }
                break;
            case ']]':
                if ('>' == $v) {
                    // End of the section: emit the payload as escaped text.
                    $new_xml .= htmlentities($cdata);
                    // Alternative: emit the payload unescaped.
                    // $new_xml .= $cdata;
                    // (An earlier variant escaped the special characters by hand
                    // with nested str_replace() calls instead of htmlentities().)
                    $state = 'out';
                } else {
                    // Any other character, including a third ']', sends the
                    // scanner back to 'in'.
                    $cdata .= $state . $v;
                    $state = 'in';
                }
                break;
        } // switch
    }
    //
    // Return.
    //
    return $new_xml;
}

Recently I noticed that alerts kept firing, reporting that SimpleXML had failed to parse.

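It turned out that some of the XML data looked like <![CDATA[domain[test]]]>. Three consecutive ] characters appear there, and the parsing function above cannot handle them.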

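Patching the state machine for this case is also awkward, because you cannot know whether four or five ] characters will show up next time.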
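For reference, the XML rule is that a CDATA section ends at the first `]]>` after the opener, so any extra ] before it is simply data. The sketch below is not the author's code: `escape_cdata_sections` is a hypothetical helper that searches for the delimiters with `strpos` instead of counting brackets, and it assumes well-formed, non-nested sections. It also uses `htmlspecialchars` with `ENT_XML1` where the original calls `htmlentities`; that substitution is my assumption, made to avoid producing HTML-only entity names in XML output.

```php
<?php
/**
 * Hypothetical helper: replace every CDATA section in $xml with its
 * payload as escaped text. Not taken from the original article.
 */
function escape_cdata_sections(string $xml): string
{
    $open  = '<![CDATA[';
    $close = ']]>';
    $out   = '';
    $pos   = 0;

    while (($start = strpos($xml, $open, $pos)) !== false) {
        $dataStart = $start + strlen($open);
        // The first "]]>" after the opener terminates the section,
        // no matter how many "]" characters precede it.
        $end = strpos($xml, $close, $dataStart);
        if ($end === false) {
            break; // unterminated section: leave the remainder untouched
        }
        $out .= substr($xml, $pos, $start - $pos);
        $payload = substr($xml, $dataStart, $end - $dataStart);
        $out .= htmlspecialchars($payload, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $pos = $end + strlen($close);
    }

    return $out . substr($xml, $pos);
}

echo escape_cdata_sections('<a><![CDATA[domain[test]]]></a>'), "\n";
// prints: <a>domain[test]</a>
```

Even with a fix along these lines, hand-rolled scanning covers only this one construct, whereas a real XML parser handles the rest of the syntax as well.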

So I decided to replace this parsing code with DOM XML. DOM itself is fairly simple to work with; it consists of a few components: DOMElement, DOMDocument, DOMNodeList and DOMNode.
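How these classes relate can be checked directly. The snippet below is a small illustration with a made-up document; it only relies on PHP's bundled DOM extension.

```php
<?php
// Sample document, invented for this illustration.
$doc = new DOMDocument();
$doc->loadXML('<root><item>a</item><item>b</item></root>');

$root  = $doc->documentElement;               // the root DOMElement
$items = $doc->getElementsByTagName('item');  // a DOMNodeList

var_dump($root instanceof DOMElement);   // bool(true)
var_dump($root instanceof DOMNode);      // bool(true): DOMElement extends DOMNode
var_dump($doc instanceof DOMNode);       // bool(true): DOMDocument extends DOMNode
var_dump($items instanceof DOMNodeList); // bool(true)
var_dump($items->length);                // int(2)
```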

A DOMNode exposes the properties nodeValue, nodeType and nodeName.

First use loadXML to turn the string into a DOMDocument object, then call getElementsByTagName to get a DOMNodeList, then use ->item(0) to get a DOMNode. After that, the three properties above are available.
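The steps above can be sketched as follows. The element name `name` and the surrounding `<root>` are assumptions made for the example; the CDATA payload is the one that tripped the hand-written scanner.

```php
<?php
// Sample input (wrapper elements are invented): the payload ends with "]",
// so the raw text contains the run "]]]>".
$xml = '<root><name><![CDATA[domain[test]]]></name></root>';

$doc = new DOMDocument();
if (!$doc->loadXML($xml)) {
    throw new RuntimeException('Could not parse XML');
}

$nodes = $doc->getElementsByTagName('name'); // DOMNodeList
$node  = $nodes->item(0);                    // DOMNode, or null if nothing matched

if ($node !== null) {
    echo $node->nodeName, "\n";   // name
    echo $node->nodeType, "\n";   // 1 (XML_ELEMENT_NODE)
    echo $node->nodeValue, "\n";  // domain[test]
}
```

The parser takes care of the CDATA delimiters, so no bracket counting is needed in application code.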

For an XML tag such as <aa color='red'>test</aa>, read the attribute through the attribute functions, for example DOMElement::getAttribute().
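A minimal sketch of reading that attribute, assuming the tag is the whole document:

```php
<?php
$doc = new DOMDocument();
$doc->loadXML("<aa color='red'>test</aa>");

$element = $doc->getElementsByTagName('aa')->item(0);

$color = null;
if ($element instanceof DOMElement && $element->hasAttribute('color')) {
    // getAttribute() is defined on DOMElement, not on plain DOMNode.
    $color = $element->getAttribute('color');
}

echo $color, "\n";               // red
echo $element->nodeValue, "\n";  // test
```

Checking hasAttribute() first distinguishes a missing attribute from an empty one, since getAttribute() returns an empty string in both cases.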

Article URL: https://www.stayed.cn/item/21574
