
xmlTreeParse(file, ignoreBlanks=TRUE, handlers=NULL, replaceEntities=FALSE, asText=FALSE, trim=TRUE, validate=FALSE, getDTD=FALSE, isURL=FALSE)
See isURL.
Additionally, the file can be compressed (gzip) and is read directly without the user having to decompress it.
isURL
Whether the file argument refers to a URL (accessible via ftp or http) or a regular file on the system. If asText is TRUE, this should not be specified. The function attempts to determine whether the data source is a URL.
The result has components file, version and children.
file
version
children
The nodes are of class XMLNode.
These are made up of 4 fields: name, attributes, children and value.
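As a rough illustration of these four fields, the sketch below parses a small inline document and inspects its root node. It assumes the XML package is installed. It also assumes the accessor functions xmlRoot, xmlAttrs, xmlChildren and xmlValue are available in the installed version; only xmlName appears elsewhere on this page. The sample document is made up.

```r
library(XML)

# A made-up document, passed as text rather than as a file name.
txt <- '<doc id="a1"><title>Hello</title><!-- a comment --></doc>'
tree <- xmlTreeParse(txt, asText = TRUE)

root <- xmlRoot(tree)        # top-level element of the document
xmlName(root)                # the name field
xmlAttrs(root)               # the attributes field, a named character vector
names(xmlChildren(root))     # the children field, one entry per child node
xmlValue(root[["title"]])    # the text content held under <title>
```

Using the accessors rather than indexing the list fields directly keeps the sketch independent of how a particular version lays out the returned object.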
Specializations of XMLNode, such as XMLComment, XMLProcessingInstruction and XMLEntityRef, are also used.

If the value of the argument getDTD is TRUE, the return value is a
list of length 2. The first element is the document as described
above. The second element is a list containing the external and
internal DTDs. Each of these contains 2 lists, one for elements
and another for entities. See parseDTD.
The handlers argument is used similarly
to those specified in xmlEventParse.
When an XML tag (element) is processed,
we look for a function in this collection
with the same name as the tag's name.
If this is not found, we look for one named
startElement. If this is not found, we use the default
built-in converter.
The same works for comments, entity references, etc.
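The lookup order described above can be sketched in plain R. This is only an illustration of the rule, not the parser's actual code: pick_handler is a hypothetical helper, and identity stands in for the built-in converter.

```r
# Hypothetical helper: choose the handler for an element with tag name `tag`.
pick_handler <- function(tag, handlers, default = identity) {
  if (is.function(handlers[[tag]])) {
    handlers[[tag]]                # 1. a handler named after the tag itself
  } else if (is.function(handlers[["startElement"]])) {
    handlers[["startElement"]]     # 2. the generic element handler
  } else {
    default                        # 3. stand-in for the built-in converter
  }
}

handlers <- list(
  title        = function(node) "title handler",
  startElement = function(node) "generic handler"
)

pick_handler("title", handlers)(NULL)   # uses the tag-specific handler
pick_handler("para",  handlers)(NULL)   # falls back to startElement
pick_handler("para",  list())(NULL)     # falls back to the default
```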
The default entries should be named comment, startElement, externalEntity, processingInstruction and text.
They should take the XMLNode as their first argument.
In the future, other information may be passed via ...,
for example, the depth in the tree.
Specifically, the second argument will be the parent node into which they
are being added, but this is not currently implemented,
so it should have a default value (NULL).

Each of these functions can return arbitrary values that are then
entered into the tree in place of the default node passed to the
function as the first argument. This allows the caller to generate
the nodes of the resulting document tree exactly as they wish. In the
future, if the function returns NULL, this node will be dropped
from the tree.
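To make the replacement rule concrete, here is a small stand-alone sketch in plain R. It does not use the parser at all: mock_node, build_tree and summarise_node are hypothetical names, the nodes are plain lists rather than real XMLNode objects, and converting children before their parent is an assumption of the sketch.

```r
# Mock nodes built from plain lists; these are NOT real XMLNode objects.
mock_node <- function(name, children = list()) {
  structure(list(name = name, children = children), class = "MockNode")
}

# Hypothetical tree builder: each node is passed to the handler, and whatever
# the handler returns is stored in the tree in place of the node.
build_tree <- function(node, handler) {
  node$children <- lapply(node$children, build_tree, handler = handler)
  handler(node, parent = NULL)
}

# Handler that replaces every node by a one-line summary string.
summarise_node <- function(node, parent = NULL) {
  paste0("<", node$name, "> with ", length(node$children), " child node(s)")
}

doc <- mock_node("doc", list(mock_node("item"), mock_node("item")))
build_tree(doc, summarise_node)
```

Because the handler's return value is used as-is, the resulting tree here is a character string rather than a node, which shows that the entries need not be XMLNode objects.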
fileName <- system.file("data/test.xml", package="XML")
# parse the document and return it in its standard format.
xmlTreeParse(fileName)
# parse the document, discarding comments.
xmlTreeParse(fileName, handlers=list("comment"=function(x, parent){NULL}))
# report each entity as it is processed.
invisible(xmlTreeParse(fileName,
            handlers=list(entity=function(x) {
                cat("In entity", x$name, x$value, "\n")
            })
          )
        )
# Parse some XML text.
# Read the text from the file
xmlText <- paste(scan(fileName, what="", sep="\n"), collapse="\n")
xmlTreeParse(xmlText, asText=TRUE)
# Read a MathML document and convert each node
# so that the primary class is
# <name of tag>MathML
# so that we can use method dispatching when processing
# it rather than conditional statements on the tag name.
# See plotMathML() in examples/.
fileName <- system.file("data/mathml.xml", package="XML")
m <- xmlTreeParse(fileName,
       handlers=list(startElement=function(node) {
           # prepend a tag-specific class, e.g. "mrowMathML"
           cname <- paste(xmlName(node), "MathML", sep="", collapse="")
           class(node) <- c(cname, class(node))
           node
       }))
# Parse an XML document directly from a URL.
# Requires Internet access.
xmlTreeParse("http://www.omegahat.org/Scripts/Data/mtcars.xml", asText=FALSE)