XPathEngine
XPathEngine - Class
Located in /web/phpsysinfo/includes/XPath.class.php (line 854)
XPathBase | --XPathEngine
| Class | Description |
|---|---|
| XPath | |
(mixed)
exportAsHtml
([$absoluteXPath $absoluteXPath = ''], [$hilighXpathList $hilightXpathList = array()])
(string)
exportToFile
($fileName $fileName, [$absoluteXPath $absoluteXPath = ''], [$xmlHeader $xmlHeader = NULL])
(mixed)
_evaluateOperator
($left $left, $right $operator, $operator $right, $operatorType $operatorType, $context $context)
(mixed)
_export
([$absoluteXPath $absoluteXPath = ''], [$xmlHeader $xmlHeader = NULL], [$hilightXpath $hilightXpathList = ''])
mixed
$axes
= array ( 'ancestor', 'ancestor_or_self', 'attribute', 'child', 'descendant',
mixed
$axPathLiterals
= array() (line 896)
mixed
$emptyNode
= array(
mixed
$errorStrings
= array(
mixed
$functions
= array ( 'last', 'position', 'count', 'id', 'name',
mixed
$nodeIndex
= array() (line 899)
mixed
$nodeRoot
= array() (line 900)
mixed
$nodeStack
= array() (line 917)
mixed
$operators
= array( ' or ', ' and ', '=', '!=', '<=', '<', '>=', '>',
mixed
$parsedTextLocation
= '' (line 922)
mixed
$parseOptions
= array() (line 921)
mixed
$parseSkipWhiteCache
= 0 (line 924)
mixed
$parseStackIndex
= 0 (line 918)
mixed
$parsInCData
= 0 (line 923)
mixed
$_indexIsDirty
= FALSE (line 913)
Inherited from XPathBase
XPathBase::$aDebugFunctions
XPathBase::$aDebugOpenLinks
XPathBase::$bClassProfiling
XPathBase::$bDebugXmlParse
XPathBase::$iDebugNextLinkNumber
XPathBase::$_lastError
Constructor
Optionally, you may call this constructor with the name of the XML file to parse and an XML option vector. Each entry in the option vector is passed to xml_parser_set_option().
A sample option vector: $xmlOpt = array(XML_OPTION_CASE_FOLDING => FALSE, XML_OPTION_SKIP_WHITE => TRUE);
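As a rough illustration of what passing each entry to xml_parser_set_option() amounts to, here is a minimal sketch that uses only PHP's built-in XML extension. The helper name createParserWithOptionsSketch() is hypothetical and is not part of this class.

```php
<?php
// Hypothetical helper (not part of XPathEngine): applies an option vector
// to a freshly created PHP XML parser, one xml_parser_set_option() call per entry.
function createParserWithOptionsSketch(array $xmlOpt)
{
    $parser = xml_parser_create();
    foreach ($xmlOpt as $option => $value) {
        // Booleans are converted to 0/1 so the call behaves the same across PHP versions.
        $value = is_bool($value) ? (int) $value : $value;
        xml_parser_set_option($parser, $option, $value);
    }
    return $parser;
}

$xmlOpt = array(XML_OPTION_CASE_FOLDING => false, XML_OPTION_SKIP_WHITE => true);
$parser = createParserWithOptionsSketch($xmlOpt);
```

The real constructor may validate or store the options differently; the sketch only shows the option-by-option hand-off described above.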
Clone a node and its child nodes.
NOTE: If the node has children you *MUST* use the reference operator! E.g. $clonedNode =& cloneNode($node); Otherwise the children will not point back to the parent, they will point back to your temporary variable instead.
Decodes the character set entities in the given string.
This function is provided for convenience, because all text strings and attributes are returned to you with their entities still encoded. You can use this function to decode these entities.
It makes use of the get_html_translation_table(HTML_ENTITIES) PHP library call, so it is limited in the same ways. At the time of writing this seemed to be restricted to ISO-8859-1.
### Provide an option that will do this by default.
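The approach based on get_html_translation_table(HTML_ENTITIES) can be sketched as follows. This is an assumption about the general technique, not the class's actual code; the function name decodeEntitiesSketch() and the choice of UTF-8 as the table encoding are illustrative only.

```php
<?php
// Hypothetical sketch: decode named HTML entities by flipping PHP's translation table.
function decodeEntitiesSketch(string $encoded): string
{
    // The table maps characters to entities ('<' => '&lt;'); flipping it gives the decoder.
    $decodeTable = array_flip(get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8'));
    // strtr() with an array never re-scans its own output, so '&amp;amp;' becomes '&amp;'.
    return strtr($encoded, $decodeTable);
}
```

Only entities present in the translation table are decoded, which is the limitation the description above refers to.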
Compare two nodes to see if they are equal (point to the same node in the doc)
Two nodes are considered equal if their absolute XPaths are equal.
Alias for the match function
Returns the containing XML as marked up HTML with specified nodes hi-lighted
Given a context this function returns the containing XML
Generates an XML string with the content of the current document and writes it to a file.
By default a <?xml ...> tag is also included at the start of the data.
Get the node defined by the $absoluteXPath.
Get the absolute XPath of a node that is in a document tree.
Retrieves the absolute parent XPath query.
The parents stored in the tree are only relative parents, but all the parent information is contained in the XPath query itself. So instead we use a function to extract the parent from the absolute XPath query.
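Extracting the parent from an absolute XPath is essentially a string operation. The sketch below assumes absolute XPaths of the form /AAA[1]/B[2] with no '/' inside predicates; the function name parentXPathSketch() and the choice to return '' for a top-level node are assumptions, not documented behaviour.

```php
<?php
// Hypothetical sketch: derive the parent's absolute XPath by cutting off the last step.
function parentXPathSketch(string $absoluteXPath): string
{
    $lastSlash = strrpos($absoluteXPath, '/');
    if ($lastSlash === false || $lastSlash === 0) {
        // A top-level node (or an empty path) has no parent step; '' is an assumed convention.
        return '';
    }
    return substr($absoluteXPath, 0, $lastSlash);
}
```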
Returns the requested property or properties.
If $param is not given, all properties are returned in a hash.
Returns TRUE if the given node has child nodes below it
Reads a file or URL and parses the XML data.
Parse the XML source and (upon success) store the information into an internal structure.
Reads a string and parses the XML data.
Parse the XML source and (upon success) store the information into an internal structure. If a parent xpath is given this means that XML data is to be *appended* to that parent.
### If a function uses setLastError(), then say in the function header that getLastError() is useful.
Matches (evaluates) an XPath query
This method tries to evaluate an XPath query by parsing it. An XML source must have been imported before this method is able to work.
Update nodeIndex and every node of the node-tree.
Call this after you have finished any tree modifications; otherwise a match with an xPathQuery will produce wrong results. The $this->nodeIndex[] is recreated and every node's optimization data is updated. The optimization data is redundant information that is stored because it would otherwise take longer to find. Child nodes with value NULL are removed from the tree.
By default the modification functions in this component automatically re-index the nodes in the tree. Sometimes this is not the behavior you want. To suppress the reindex, set the function's $autoReindex parameter to FALSE and call reindexNodeTree() at the end of your changes. This sometimes leads to better code (and less CPU overhead).
Sample:
=======
Given the XML <AAA><B/>.<B/>.<B/></AAA>, the goal is <AAA>.<B/>.</AAA> (delete B[1] and B[3]).
  $xPathSet = $xPath->match('//B'); # Will result in array('/AAA[1]/B[1]', '/AAA[1]/B[2]', '/AAA[1]/B[3]');
There are three ways to do it.
1) Top-down (with auto reindexing) - Safe and slow, and it is easy to get mixed up by the changing node index.
  removeChild('/AAA[1]/B[1]'); // B[1] removed, thus all B[n] become B[n-1] !!
  removeChild('/AAA[1]/B[2]'); // Now remove B[2] (that originally was B[3])
2) Bottom-up (with auto reindexing) - Safe and slow, and the changing node index (caused by auto-reindex) can be ignored.
  for ($i=sizeOf($xPathSet)-1; $i>=0; $i--) {
    if ($i==1) continue;
    removeChild($xPathSet[$i]);
  }
3) Top-down (with *NO* auto reindexing) - Fast, and safe as long as you call reindexNodeTree().
  foreach($xPathSet as $xPath) {
    // Specify no reindexing
    if ($xPath == $xPathSet[1]) continue;
    removeChild($xPath, $autoReindex=FALSE);
    // The object is now in a slightly inconsistent state.
  }
  // Finally do the reindex and the object is consistent again
  reindexNodeTree();
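The index-shifting pitfall behind variants 1) and 2) can be reproduced with a plain PHP array standing in for the child list. Both function names are hypothetical, and array_splice() merely imitates a removal followed by an automatic reindex; nothing here calls the class itself.

```php
<?php
// Hypothetical stand-in for "remove, then auto-reindex": positions are 1-based like B[1], B[2], ...

// Naive top-down removal: later positions are stale because earlier removals shift them.
function removeTopDownNaiveSketch(array $items, array $positions): array
{
    sort($positions);
    foreach ($positions as $position) {
        array_splice($items, $position - 1, 1);
    }
    return $items;
}

// Bottom-up removal: removing the highest position first never disturbs the lower ones.
function removeBottomUpSketch(array $items, array $positions): array
{
    rsort($positions);
    foreach ($positions as $position) {
        array_splice($items, $position - 1, 1);
    }
    return $items;
}
```

With array('B1', 'B2', 'B3') and positions 1 and 3, the naive variant leaves the original third element in place, while the bottom-up variant leaves only the original second element, which is the stated goal.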
Resets the object so that it is able to take a new XML string/file.
Constructing objects is slow. If you can, reuse ones that you have used already by using this reset() function.
Alternative way to control whether case-folding is enabled for this XML parser.
Short cut to setXmlOptions(XML_OPTION_CASE_FOLDING, TRUE/FALSE)
When it comes to XML, case-folding simply means uppercasing all tag and attribute names (NOT the content) if set to TRUE. Note that if you have this option set, your XPath queries will also be case-folded for you.
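The effect of case-folding on tag and attribute names can be observed directly with PHP's XML extension. The function tagNamesSketch() below is a hypothetical helper written for illustration; it does not use this class.

```php
<?php
// Hypothetical sketch: collect element names and '@'-prefixed attribute names as the parser reports them.
function tagNamesSketch(string $xml, bool $caseFolding): array
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding ? 1 : 0);

    $names = array();
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attributes) use (&$names) {
            $names[] = $name;
            foreach (array_keys($attributes) as $attributeName) {
                $names[] = '@' . $attributeName;
            }
        },
        function ($parser, $name) {
            // Closing tags are not needed for this illustration.
        }
    );
    xml_parse($parser, $xml, true);
    return $names;
}
```

Calling it with `<aaa id="1">text</aaa>` shows the names uppercased when folding is on and unchanged when it is off; the text content is not affected either way.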
Alternative way to control whether skip-white-spaces is enabled for this XML parser.
Short cut to setXmlOptions(XML_OPTION_SKIP_WHITE, TRUE/FALSE)
When it comes to XML, skip-white-spaces will trim the tag content. An XML file with no whitespace will be faster to process, but will make your data less human-readable when you come to write it out.
Running with this option on will slow the class down. If you want to speed up processing, run the XML through once with skip-white-spaces turned on, write out the new version of your XML without whitespace, and then use the new XML file with skip-white-spaces turned off.
Sets a single xml_parser_set_option() option.
Sets several xml_parser_set_option() options at once.
Gets the content of a node text part or node attribute.
If the absolute XPath references an attribute (the XPath ends with @ or attribute::), then the text value of that node attribute is returned. Otherwise the XPath references a text part of the node. This can be either a direct reference to a text part (the XPath ends with text()[<nr>]) or an indirect reference (a simple absolute XPath to a node).
1) Direct reference (the XPath ends with text()[<part-number>]): If the part number is omitted, the first text part is assumed; numbering starts at 1. Negative numbers are allowed, where -1 is the last text part, and so on.
2) Indirect reference (a simple absolute XPath to a node): The default is to return the *whole text*, that is, the concatenated text parts of the matching node. (NOTE: only in this case do you get a copy, so changes to the returned value would have no effect.) Optionally you may pass a parameter $textPartNr to define the text part you want; numbering starts at 1. Negative numbers are allowed, where -1 is the last text part, and so on.
NOTE I: The returned value can be fetched by reference if you wish to modify the text, e.g. $text =& wholeText().
NOTE II: Text-part numbers out of range will return FALSE.
SIDENOTE: The function name is a suggestion from W3C in the XPath specification level 3.
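The text-part numbering rules (1-based, negative numbers counting from the end, FALSE when out of range, whole text when no number is given) can be sketched on a plain array of text parts. The function textPartSketch() and its array argument are assumptions made for illustration; the class resolves the parts from an XPath instead.

```php
<?php
// Hypothetical sketch of the text-part selection rules described above.
function textPartSketch(array $textParts, ?int $textPartNr = null)
{
    if ($textPartNr === null) {
        // No part number: return the whole text, i.e. all parts concatenated.
        return implode('', $textParts);
    }
    // Positive numbers count from the start (1 = first), negative from the end (-1 = last).
    $index = $textPartNr > 0 ? $textPartNr - 1 : count($textParts) + $textPartNr;
    if ($textPartNr === 0 || !isset($textParts[$index])) {
        return false; // out of range
    }
    return $textParts[$index];
}
```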
Adds a literal to our array of literals
In order to make sure we don't interpret literal strings as XPath expressions, we have to encode literal strings so that we know that they are not XPaths.
Returns the given string as a literal reference.
Checks whether a node matches a node-test.
This method checks whether a node in the document matches a given node-test. A node test is something like text(), node(), or an element name.
Checks whether a node matches predicates.
This method checks whether a list of nodes passed to this method match a given list of predicates.
Evaluates an XPath Expr
$this->evaluate() is the entry point and does some initialization, while this function is called recursively internally for every sub-XPath expression we find. It handles the following syntax, and calls evaluatePathExpr if it finds that none of this grammar applies.
http://www.w3.org/TR/xpath#section-Basics
[14] Expr ::= OrExpr
[21] OrExpr ::= AndExpr | OrExpr 'or' AndExpr
[22] AndExpr ::= EqualityExpr | AndExpr 'and' EqualityExpr
[23] EqualityExpr ::= RelationalExpr | EqualityExpr '=' RelationalExpr | EqualityExpr '!=' RelationalExpr
[24] RelationalExpr ::= AdditiveExpr | RelationalExpr '<' AdditiveExpr | RelationalExpr '>' AdditiveExpr | RelationalExpr '<=' AdditiveExpr | RelationalExpr '>=' AdditiveExpr
[25] AdditiveExpr ::= MultiplicativeExpr | AdditiveExpr '+' MultiplicativeExpr | AdditiveExpr '-' MultiplicativeExpr
[26] MultiplicativeExpr ::= UnaryExpr | MultiplicativeExpr MultiplyOperator UnaryExpr | MultiplicativeExpr 'div' UnaryExpr | MultiplicativeExpr 'mod' UnaryExpr
[27] UnaryExpr ::= UnionExpr | '-' UnaryExpr
[18] UnionExpr ::= PathExpr | UnionExpr '|' PathExpr
NOTE: The effect of the above grammar is that the order of precedence is (lowest precedence first):
1) or
2) and
3) =, !=
4) <=, <, >=, >
5) +, -
6) *, div, mod
7) - (negate)
8) |
Evaluates an XPath function
This method evaluates a given XPath function with its arguments on a specific node of the document.
Evaluate the result of an operator whose operands have been evaluated
If the operator type is not "NodeSet", then neither the left nor the right operand will be a node set, because the processing when one or the other is an array is complex and should be handled by the caller.
Evaluates an XPath PathExpr
It handles the following syntax:
http://www.w3.org/TR/xpath#node-sets http://www.w3.org/TR/xpath#NT-LocationPath http://www.w3.org/TR/xpath#path-abbrev http://www.w3.org/TR/xpath#NT-Step
[19] PathExpr ::= LocationPath | FilterExpr | FilterExpr '/' RelativeLocationPath | FilterExpr '//' RelativeLocationPath
[20] FilterExpr ::= PrimaryExpr | FilterExpr Predicate
[1] LocationPath ::= RelativeLocationPath | AbsoluteLocationPath
[2] AbsoluteLocationPath ::= '/' RelativeLocationPath? | AbbreviatedAbsoluteLocationPath
[3] RelativeLocationPath ::= Step | RelativeLocationPath '/' Step | AbbreviatedRelativeLocationPath
[4] Step ::= AxisSpecifier NodeTest Predicate* | AbbreviatedStep
[5] AxisSpecifier ::= AxisName '::' | AbbreviatedAxisSpecifier
[10] AbbreviatedAbsoluteLocationPath ::= '//' RelativeLocationPath
[11] AbbreviatedRelativeLocationPath ::= RelativeLocationPath '//' Step
[12] AbbreviatedStep ::= '.' | '..'
[13] AbbreviatedAxisSpecifier ::= '@'?
If you expand all the abbreviated versions, then the grammar simplifies to:
[19] PathExpr ::= RelativeLocationPath | '/' RelativeLocationPath? | FilterExpr | FilterExpr '/' RelativeLocationPath
[20] FilterExpr ::= PrimaryExpr | FilterExpr Predicate
[3] RelativeLocationPath ::= Step | RelativeLocationPath '/' Step
[4] Step ::= AxisName '::' NodeTest Predicate*
Conceptually you can say that we should split by '/' and try to treat the parts as steps, and if that fails then try to treat it as a PrimaryExpr.
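Splitting by '/' has to ignore slashes that sit inside predicates or function arguments. A minimal sketch of such a bracket-aware split follows; splitStepsSketch() is a hypothetical name, and the sketch assumes literal strings have already been replaced so that brackets inside quotes are not an issue.

```php
<?php
// Hypothetical sketch: split a path into steps at '/' characters that are outside [] and ().
function splitStepsSketch(string $path): array
{
    $steps = array();
    $current = '';
    $depth = 0;
    $length = strlen($path);
    for ($i = 0; $i < $length; $i++) {
        $char = $path[$i];
        if ($char === '[' || $char === '(') {
            $depth++;
        } elseif ($char === ']' || $char === ')') {
            $depth--;
        }
        if ($char === '/' && $depth === 0) {
            // A top-level slash ends the current step.
            $steps[] = $current;
            $current = '';
            continue;
        }
        $current .= $char;
    }
    $steps[] = $current;
    return $steps;
}
```

An absolute path such as /AAA/B yields an empty first element, which a caller could interpret as "start at the root".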
Evaluates an XPath PrimaryExpr
http://www.w3.org/TR/xpath#section-Basics
[15] PrimaryExpr ::= VariableReference | '(' Expr ')' | Literal | Number | FunctionCall
Evaluate a step from a XPathQuery expression at a specific contextPath.
Steps are the parts of an XPathQuery when it is divided by '/'. A contextPath is an absolute XPath (or vector of XPaths) to the starting node(s) from which the step should be evaluated.
Generates a XML string with the content of the current document.
This is the starting point for extracting the XML data from the node tree. We do some preparations and then call _InternalExport() to fetch the main XML data. You may optionally pass an XPath to any node, which will then be used as the top node, in order to extract parts of the document. The default is '', meaning that the whole document is extracted.
You may also pass an 'xmlHeader' (usually something like <?xml version="1.0"?>) that will overwrite any other 'xmlHeader', if there was one in the original source. If there wasn't one in the original source and you don't specify one, a default of <?xml version="1.0"?> is used. Finally, when exporting to HTML, you may pass a vector of the XPaths you want to highlight. The highlighted tags and attributes will receive a nice color.
NOTE I: The output can have 2 formats:
a) If "skip white spaces" is/was set (not recommended, slower): The output is formatted by adding indenting and carriage returns.
b) If "skip white spaces" is/was *NOT* set: 'as is'. No formatting is done. The output should be the same as the original parsed XML source.
Create the ids that are accessable through the generate-id() function
Retrieves axis information from an XPath query step.
This method tries to extract the name of the axis and its node-test from a given step of an XPath query at a given node. If it can't parse the step, then we treat it as a PrimaryExpr.
[4] Step ::= AxisSpecifier NodeTest Predicate* | AbbreviatedStep
[5] AxisSpecifier ::= AxisName '::' | AbbreviatedAxisSpecifier
[12] AbbreviatedStep ::= '.' | '..'
[13] AbbreviatedAxisSpecifier ::= '@'?
[7] NodeTest ::= NameTest | NodeType '(' ')' | 'processing-instruction' '(' Literal ')'
[37] NameTest ::= '*' | NCName ':' '*' | QName
[38] NodeType ::= 'comment' | 'text' | 'processing-instruction' | 'node'
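The abbreviation rules in this grammar ('.' is self::node(), '..' is parent::node(), '@' is attribute::, and a missing axis means child::) can be sketched as below. parseStepSketch() is a hypothetical helper that ignores predicates entirely; it is not the class's own parsing routine.

```php
<?php
// Hypothetical sketch: expand an (unpredicated) step into an explicit axis and node-test.
function parseStepSketch(string $step): array
{
    if ($step === '.') {
        return array('axis' => 'self', 'nodeTest' => 'node()');
    }
    if ($step === '..') {
        return array('axis' => 'parent', 'nodeTest' => 'node()');
    }
    if ($step !== '' && $step[0] === '@') {
        return array('axis' => 'attribute', 'nodeTest' => substr($step, 1));
    }
    if (preg_match('/^([a-z-]+)::(.+)$/', $step, $matches)) {
        // Explicit axis, e.g. descendant::B
        return array('axis' => $matches[1], 'nodeTest' => $matches[2]);
    }
    // No axis given: the child axis is the default.
    return array('axis' => 'child', 'nodeTest' => $step);
}
```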
Look for operators in the expression
Parses through the given expression looking for operators. If found returns the operands and the operator in the resulting array.
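One way to find the operator at which to split an expression is to scan for the lowest-precedence operator that is not nested inside brackets. The sketch below covers only the 'or', 'and', equality and relational operators and assumes literals have already been removed; splitAtOperatorSketch() and the shape of its result array are assumptions, not the class's actual return format.

```php
<?php
// Hypothetical sketch: split an expression at its lowest-precedence top-level operator.
function splitAtOperatorSketch(string $expr): ?array
{
    // Precedence groups, lowest first; longer tokens come first within a group.
    $groups = array(array(' or '), array(' and '), array('!=', '='), array('<=', '>=', '<', '>'));
    $length = strlen($expr);
    foreach ($groups as $group) {
        $depth = 0;
        $found = null;
        for ($i = 0; $i < $length; $i++) {
            $char = $expr[$i];
            if ($char === '(' || $char === '[') { $depth++; continue; }
            if ($char === ')' || $char === ']') { $depth--; continue; }
            if ($depth !== 0) { continue; }
            foreach ($group as $operator) {
                if (substr($expr, $i, strlen($operator)) !== $operator) { continue; }
                // A bare '=' that is really the tail of '!=', '<=' or '>=' does not count.
                if ($operator === '=' && $i > 0 && strpos('!<>', $expr[$i - 1]) !== false) { continue; }
                $found = array($i, $operator); // keep the rightmost hit (left-associative operators)
                $i += strlen($operator) - 1;
                break;
            }
        }
        if ($found !== null) {
            list($position, $operator) = $found;
            return array(
                'left' => trim(substr($expr, 0, $position)),
                'operator' => trim($operator),
                'right' => trim(substr($expr, $position + strlen($operator))),
            );
        }
    }
    return null; // no operator found
}
```

Splitting at the lowest-precedence operator first means the higher-precedence parts end up deeper in the recursion and are therefore evaluated first.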
Handles the XPath ancestor axis.
Handles the XPath ancestor-or-self axis.
This method handles the XPath ancestor-or-self axis.
Handles the XPath attribute axis.
Handles the XPath child axis.
This method handles the XPath child axis. It essentially filters out the children to match the name specified after the '/'.
Handles the XPath descendant axis.
Handles the XPath descendant-or-self axis.
Handles the XPath following axis.
Handles the XPath following-sibling axis.
Handles the XPath namespace axis.
Handles the XPath parent axis.
Handles the XPath preceding axis.
Handles the XPath preceding-sibling axis.
Handles the XPath self axis.
Handles character data while parsing.
This method is called for each piece of character data encountered while parsing an XML document. It adds the character data to the document tree.
Default handler for the XML parser.
While parsing an XML document, any string not caught by one of the other handler functions ends up here.
Handles closing XML tags while parsing.
This method is called for each closing tag encountered while parsing an XML document.
Handles the XPath function boolean.
http://www.w3.org/TR/xpath#section-Boolean-Functions
Handles the XPath function ceiling.
Handles the XPath function concat.
Handles the XPath function contains.
Handles the XPath function count.
Handles the XPath function false.
Handles the XPath function floor.
Handles the XPath function generate-id.
Produce a unique id for the first node of the node set.
Example usage, produces an index of all the nodes in an .xml document, where the content of each "section" is the exported node as XML.
$aFunctions = $xPath->match('//');
foreach ($aFunctions as $Function) {
    $id = $xPath->match("generate-id($Function)");
    echo "<a href='#$id'>$Function</a>\n";
}
foreach ($aFunctions as $Function) {
    $id = $xPath->match("generate-id($Function)");
    echo "<h2 id='$id'>$Function</h2>";
    echo htmlspecialchars($xPath->exportAsXml($Function));
}
Handles the XPath function id.
Handles the XPath function lang.
Handles the XPath function last.
Handles the XPath function name.
Handles the XPath function normalize-space.
The normalize-space function returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space. If the argument is omitted, it defaults to the context node converted to a string, in other words the string-value of the context node.
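The normalization rule itself is small enough to sketch in a few lines. normalizeSpaceSketch() is a hypothetical stand-alone function; it treats space, tab, carriage return and line feed as whitespace, following the XML definition, and does not handle the omitted-argument case.

```php
<?php
// Hypothetical sketch of normalize-space() for a plain string argument.
function normalizeSpaceSketch(string $value): string
{
    // Collapse every run of XML whitespace to a single space, then strip both ends.
    $collapsed = preg_replace('/[ \t\r\n]+/', ' ', $value);
    return trim($collapsed, ' ');
}
```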
Handles the XPath function not.
Handles the XPath function number.
http://www.w3.org/TR/xpath#section-Number-Functions
Handles the XPath function position.
Handles the XPath function round.
Handles the XPath function starts-with.
Handles the XPath function string.
http://www.w3.org/TR/xpath#section-String-Functions
Handles the XPath function string-length.
Handles the XPath function substring.
Handles the XPath function substring-after.
Handles the XPath function substring-before.
Handles the XPath function sum.
Handles the XPath function translate.
Handles the XPath function true.
Handles the XPath function x-lower.
Lower-cases a string. string x-lower(string)
Handles the XPath function x-upper.
Upper-cases a string. string x-upper(string)
Handles processing instruction (PI)
A processing instruction has the following format: <? target data ?> e.g. <? dtd version="1.0" ?>
Currently I have no better idea than to leave it 'as is' and treat the PI data as normal text (adding the surrounding PI tags <? ?>).
Handles opening XML tags while parsing.
This method is called for each opening tag encountered while parsing an XML document. It adds the tag to the tree of document nodes.
Adds a new node to the XML document tree during xml parsing.
This method adds a new node to the tree of nodes of the XML document being handled by this class. The new node is created according to the parameters passed to this method. This method is a much watered-down version of appendChild(), used only when parsing an XML file.
It is assumed that adding starts with the root and progresses through the document in parse order. New nodes must have a corresponding parent, and once we have read the </> tag for an element we will never need to add any more data to that node. Otherwise the add will be ignored or fail.
The function is facilitated by a nodeStack, which is an array of nodes that we have yet to close.
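The stack idea can be sketched with PHP's XML extension: every opening tag pushes a node, character data is added to the node on top of the stack, and a closing tag pops the node and attaches it to its parent. The function parseToTreeSketch() and the node layout ('name', 'attributes', 'text', 'childNodes') are assumptions for illustration and are much simpler than the class's real node structure.

```php
<?php
// Hypothetical sketch: build a nested array tree using a stack of still-open nodes.
function parseToTreeSketch(string $xml): ?array
{
    // The bottom entry is a synthetic super-root that collects the document element.
    $stack = array(array('name' => '', 'attributes' => array(), 'text' => '', 'childNodes' => array()));

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attributes) use (&$stack) {
            // Opening tag: push a new, still-open node.
            $stack[] = array('name' => $name, 'attributes' => $attributes, 'text' => '', 'childNodes' => array());
        },
        function ($parser, $name) use (&$stack) {
            // Closing tag: the node is complete, so attach it to its parent.
            $closed = array_pop($stack);
            $stack[count($stack) - 1]['childNodes'][] = $closed;
        }
    );
    xml_set_character_data_handler($parser, function ($parser, $data) use (&$stack) {
        // Character data always belongs to the innermost open node.
        $stack[count($stack) - 1]['text'] .= $data;
    });

    $ok = xml_parse($parser, $xml, true);
    return $ok === 1 ? $stack[0] : null;
}
```

Because a node is attached only when its closing tag arrives, nothing can be added to it afterwards, which mirrors the assumption stated above.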
Export the xml document starting at the named node.
Here's where the work is done for reindexing (see reindexNodeTree)
Parse out the literals of an XPath expression.
Instead of doing a full lexical parse, we parse out the literal strings, and then treat the sections of the string either as parts of XPath or as literal strings. So this function replaces each literal it finds with a literal reference, and then inserts the literal into an array of strings that we can access. The literals can be accessed later from the literals associative array.
Example:
XPathExpr = /AAA[@CCC = "hello"]/BBB[DDD = 'world']
=> literals: array("hello", "world")
return value: /AAA[@CCC = $1]/BBB[DDD = $2]
Note: This does not interfere with the VariableReference syntactical element, as these elements must not start with a number.
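The replacement of literals by numbered references can be sketched with a single regular expression. removeLiteralsSketch() is a hypothetical function that returns both the encoded expression and the literal list; the class itself keeps the literals in a member array instead.

```php
<?php
// Hypothetical sketch: swap each quoted literal for a $n reference and collect the literals.
function removeLiteralsSketch(string $xPathQuery): array
{
    $literals = array();
    $encoded = preg_replace_callback(
        '/"[^"]*"|\'[^\']*\'/',
        function ($match) use (&$literals) {
            // Store the literal without its surrounding quotes.
            $literals[] = substr($match[0], 1, -1);
            return '$' . count($literals);
        },
        $xPathQuery
    );
    return array($encoded, $literals);
}
```

Because a VariableReference cannot start with a digit, the references $1, $2 and so on cannot collide with real XPath variables.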
Sort an xPathSet by doc order.
Obtain the string value of an object
http://www.w3.org/TR/xpath#dt-string-value
"For every type of node, there is a way of determining a string-value for a node of that type. For some types of node, the string-value is part of the node; for other types of node, the string-value is computed from the string-value of descendant nodes."
Translates all ampersands to their literal entity '&amp;' and back.
I wasn't aware of this problem at first, but it's important to understand why we do this. First you must know:
a) PHP's XML parser *translates* all entities to the equivalent char. E.g. &lt; is returned as '<'.
b) PHP's XML parser (in V 4.1.0) has problems with most *literal* entities! The only ones that are recognized are &amp;, &lt;, &gt; and &quot;. *ALL* others (such as &copy;) cause an XML_ERROR_UNDEFINED_ENTITY error. I reported this as a bug at http://bugs.php.net/bug.php?id=15092 (It turned out not to be a 'real' bug, but one of those nice W3C-spec things).
Forget point b) now. It's just for info, because the way we solve a) will solve b) too.
THE PROBLEM
To understand the problem, here is a sample. Given is the following XML: "<AAA> &lt; &nbsp; &gt; </AAA>". Try to parse it and PHP's XML parser will fail with an XML_ERROR_UNDEFINED_ENTITY because of the unknown literal entity '&nbsp;'. (The numeric equivalent '&#160;' would work, though.) The next try is to use the numeric equivalent 160 for '&nbsp;', thus "<AAA> &lt; &#160; &gt; </AAA>". The data we receive in the tag <AAA> is " <   > ". So we get the *translated entities* and NOT the 3 entities &lt; &#160; &gt;. Thus, we will not even notice that there were entities at all! In *most* cases we're not able to tell whether the data was given as an entity or as a 'normal' char. E.g. when receiving a quote or a single space we're not able to tell whether it was given as a 'normal' char or as &nbsp; or &quot;. Thus we lose the entity information of the XML data!
THE SOLUTION
The better solution is to keep the data 'as is' by replacing each '&' with '&amp;' before parsing begins. E.g. taking the original input from above, this would result in "<AAA> &amp;lt; &amp;nbsp; &amp;gt; </AAA>". The data we receive now for the tag <AAA> is " &lt; &nbsp; &gt; ", and that's what we want.
The bad thing is that a global replace will also replace data in sections that are NOT translated by the PHP XML parser, that is, comments (<!-- -->), PI sections (stuff between <? ?>) and CDATA blocks too. So all data coming from those sections must be reversed. This is done during the XML parse phase. So:
a) All '&' in the XML source are replaced.
b) All data that is not char data, or that is in a CDATA block, has to be reversed during the XML parse phase.
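The effect of escaping ampersands before parsing can be observed with PHP's XML extension alone. parseTextSketch() is a hypothetical helper that simply concatenates all character data; it does not perform the reverse translation for comments, PIs or CDATA blocks described above.

```php
<?php
// Hypothetical sketch: parse XML and return its character data, optionally escaping '&' first.
function parseTextSketch(string $xml, bool $escapeAmpersands): ?string
{
    if ($escapeAmpersands) {
        // Step a): every '&' becomes '&amp;', so entities reach us untranslated.
        $xml = str_replace('&', '&amp;', $xml);
    }
    $parser = xml_parser_create('UTF-8');
    $text = '';
    xml_set_character_data_handler($parser, function ($parser, $data) use (&$text) {
        // The parser may deliver the text in several chunks, so append each one.
        $text .= $data;
    });
    $ok = xml_parse($parser, $xml, true);
    return $ok === 1 ? $text : null;
}
```

With escaping turned on, the input "<AAA> &lt; &#160; &gt; </AAA>" comes back with its three entities intact; with escaping turned off, the same input comes back as translated characters and the entity information is gone.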
Inherited From XPathBase
XPathBase::XPathBase()
XPathBase::getLastError()
XPathBase::reset()
XPathBase::setVerbose()
XPathBase::_afterstr()
XPathBase::_beginDebugFunction()
XPathBase::_bracketExplode()
XPathBase::_bracketsCheck()
XPathBase::_closeDebugFunction()
XPathBase::_displayError()
XPathBase::_displayMessage()
XPathBase::_getEndGroups()
XPathBase::_prestr()
XPathBase::_printContext()
XPathBase::_ProfBegin()
XPathBase::_ProfEnd()
XPathBase::_ProfileToHtml()
XPathBase::_searchString()
XPathBase::_setLastError()
XPathBase::_treeDump()
Documentation generated on Mon, 06 Feb 2012 01:11:17 +0100 by phpDocumentor 1.4.0