I've stepped today at ORASEC blog by Paul Wright. I've found this issue a big problem. Anybody that is using Oracle client on windows can quite easy get DBA rights by editing client9.dll file. I've to check if this is still problem with latest Oracle 10.2.0.3 release.
Cheers, Paweł
Search Oracle Blogs
Saturday, December 30, 2006
Thursday, December 21, 2006
Applying Oracle 10.2.0.3 patch
Today I've installed patch 10.2.0.3 for Oracle on Windows on top of 10.2.0.2 database. I've noticed that my spool file generated by upgrade script was 33 MB big. It occurred that source of all oracle build-in packages were spooled. It was terrible experience to search for errors in this log file, as searching for 'ORA-' give hundreds results and I had to figure out if it is in comment of package, or real error.
Fortunately patch was installed successfully and everything runs fine.
Cheers, Paweł
Fortunately patch was installed successfully and everything runs fine.
Cheers, Paweł
Thursday, December 07, 2006
UTL_MATCH - String Similarity in Oracle
You
can compare string using equality operator (=), or using similarity
operator (LIKE). But there are cases where it is not enough. You Can
use UTL_MATCH package to calculate string
similarity index. This packages offers 4 functions that take two
strings as parameters:
Lets test it on NULLs:
All samples were run on Oracle 10g XE
Hope You find it usefull.
Paweł
- edit_distance - algorithm by Levenshtein - returns number of edits that must be done to change one string into second,
- edit_distance_similarity - normalized results of edit_distance in percents - integer values,
- jaro_winkler - returns similarity based on Jaro-Winkler distance algorithm,
- jaro_winkler_similarity - same as above but presented as integer in range 0-100.
create table countries(name varchar2(15) not null);Now we can take a look at results. Lets compare to miss-spelled country name: 'Slovnia'
insert into countries values ('Poland');
insert into countries values ('Germany');
insert into countries values ('United States');
insert into countries values ('Portugal');
insert into countries values ('Czech Republic');
insert into countries values ('China');
insert into countries values ('Slovakia');
insert into countries values ('Slovenia');
commit;
select nameAbove we can observe differences in algorithms.
,to_char(utl_match.edit_distance(name, 'Slovnia'),'999') edit_dist
,to_char(utl_match.edit_distance_similarity(name, 'Slovnia'),'999') edit_dist_sim
,to_char(utl_match.jaro_winkler(name, 'Slovnia'),'999d9999') jaro_winkler
,to_char(utl_match.jaro_winkler_similarity(name, 'Slovnia'),'999') jaro_winkler_sim
from countries
order by jaro_winkler_sim desc;
NAME EDIT EDIT JARO_WINK JARO
-------------------- ---- ---- --------- ----
Slovenia 1 88 .9750 97
Slovakia 2 75 .8881 88
China 5 29 .5619 56
United States 12 8 .5531 55
Poland 6 15 .5317 53
Portugal 7 13 .5119 51
Germany 7 0 .3571 35
Czech Republic 13 8 .0000 0
Lets test it on NULLs:
SQL> select to_char(utl_match.edit_distance('test', NULL),'999')We can see that using edit_distance on NULLs migth be dengerous.
2 edit_dist
3 ,to_char(utl_match.edit_distance_similarity('test', NULL),'999')
4 edit_dist_sim
5 ,to_char(utl_match.jaro_winkler('test', NULL),'999d9999')
6 jaro_winkler
7 ,to_char(utl_match.jaro_winkler_similarity('test', NULL),'999')
8 jaro_winkler_sim
9 from dual;
EDIT EDIT JARO_WINK JARO
---- ---- --------- ----
-1 125 .0000 0
All samples were run on Oracle 10g XE
Hope You find it usefull.
Paweł
Subscribe to:
Posts (Atom)