Search Oracle Blogs

List of Blogs maintained by Paweł Barut.


Thursday, January 11, 2007

Oracle XMLType: exctractvalue vs. extract

I always thought that
exctarctvalue(xml, '/Node')
is equivalent to
xml.extract('/Node/text()').getstringval()
Usually it is true, but in some cases it is not. When node contains special characters like &, >, < etc... then results are different. Let me give an example:
SQL> set define off
SQL> select aa.a.extract('Node/text()').getStringVal() extr
  2       , extractvalue(aa.a, 'Node') extractval
  3    from (select XMLRoot(
  4              XMLElement("Node", 'test & < and >'),
  5             VERSION '1.0') a from dual) aa;

EXTR
----------------------------------------
EXTRACTVAL
----------------------------------------
test &amp; &lt; and &gt;
test & < and >
First look on documentation even confirms my expectations:
"extractValue – This takes an XPath expression and returns the corresponding leaf node. The XPath expression passed to extractValue should identify a single attribute or an element that has precisely one text node child. The result is returned in the appropriate SQL data type. Function extractValue is essentially a shortcut for extract plus either getStringVal() or getNumberVal()."
XML DB Developer's Guide - b14259.pdf page 1-17.

But there is small note on that:
"Note: Function extractValue and XMLType method getStringVal() differ in their treatment of entity encoding. Function extractValue unescapes any encoded entities; method getStringVal() returns the data with entity encoding intact."
XML DB Developer's Guide - b14259.pdf page 4-10.



Cheers, Paweł

5 comments:

Barbara said...

Brimomediu zentos fazionao zenhor qualesso palconso! Penacquelosi sterridouvim usolvago valimagiri gravaidadi mistempo res um non... Kerida laso xiro zomo.

Paweł Barut said...

Barbara, thanks for comment, but I do not understand anything. Write in english, please.

Paul Jones said...

This is interesting, but it isn't the behaviour I see with escaped Unicode characters (e.g. a Czech character encoded as &367;). Using extract() still tries to unescape the entity for me.

Johnnie said...

Thanks for this information, after doing a lot of research this was an easy solution to my issue!

Johnnie said...

Thank you for sharing this information. This was a simple solution to the issue I was having with XMLAGG displaying & and < characters as escaped characters. Thanks!

 

Copyright © Paweł Barut
Printing from DOS to USB Printer