Page 1 of 1

SOUNDEX function in DB2.

Posted: Tue Nov 25, 2014 8:03 pm
by dipakchandak
Hi Anuj,

I could not find where to post questions?

I have a query. I have a string in DB2 column which contains address something like "312 MOUNT MARY ROAD BOXTER LVL JPN". Now I need to pass this db2 column thru a SOUNDEX option to generate a key out of it and compare the similar key generated for some other similar string. Is this possible? How can I use the SOUNDEX function of DB2 to achieve this?

Thanks,
Dipak

SOUNDEX function in DB2.

Posted: Tue Nov 25, 2014 8:32 pm
by zprogrammer
Hi Dipak,

Welcome to the forum!!

You could post your question in DB2 forum..Some global moderator will re-route there shortly

SOUNDEX function in DB2

Posted: Tue Nov 25, 2014 9:12 pm
by zprogrammer
To understand your question : You wanted to find similar strings similar to "312 MOUNT MARY ROAD BOXTER LVL JPN" ?

SOUNDEX could be used in cases where you know how the sounds work but you don't know how exactly to spell it

Code: Select all

   SELECT EMPNO, LASTNAME
      FROM DSN910.EMPLOYEE
      WHERE SOUNDEX(LASTNAME) = SOUNDEX('Loucesy');
Output:

Code: Select all

000110 LUCCHESSI;
Also check here

Re: SOUNDEX function in DB2.

Posted: Tue Nov 25, 2014 10:13 pm
by Anuj Dhawan
Hi Dipak,

As Pandora-Box has suggested, I split your topic and moved it to the DB2 part of the Forum.

Re: SOUNDEX function in DB2.

Posted: Tue Nov 25, 2014 10:21 pm
by Anuj Dhawan
We'd need more detailed explanation of what you want. As PB talked about SOUNDEX, below is another example on similar lines:

Code: Select all

SELECT FIRSTNAME
FROM   EMPLOYEE
WHERE  SOUNDEX(FIRSTNAME) = SOUNDEX(’Dipak’);
This query would return not only employees with a first name as “Dipak,” but also anything that sounds like Dipak - such as dipak, Deepak. If you want to catch the difference also - look for the DIFFERENCE function related to SOUNDEX. DIFFERENCE returns a value from 0 to 4 where the number represents the difference between the sounds of two strings based on applying the SOUNDEX function to the strings. The higher the value, the closer the two strings are to sounding alike.

Hope this helps but if it does not, please elaborate.

Re: SOUNDEX function in DB2.

Posted: Wed Nov 26, 2014 12:27 pm
by dipakchandak
Thanks for your explanation. I know how SOUNDEX works. Being new to the group and also in a hurry, I probably could not put my requirement in correct way. Please see below:

I have address field in one of the columns of a DB2 table. I need to compare this address to the address field that I'm getting from a file. The catch here is the address field might have multiple words and I need to get the SOUNDEX for each word seperately. (Ex- SOUNDEX('IMB MAINFRAME GROUP') does not yeild the correct result; instead we need to do SOUNDEX('IBM'), SOUNDEX('MAINFRAME'), SOUNDEX('GROUP') to get the correct soundex for entire string.)

Now, here is what I plan to do:

1. Extract the column from DB2.
2. Get the soundex for each word in the address field? Need help on how to get it for each word within the string.
3. Append the output of soundex (output of step2) to get a key.
4. create the similar key from the file address field.
5. If both keys are matching, I'm good else write it as a different address (or address is changed for a record)

Does this sound ok to you or you suggest a better way to compare the two address fields and report the differences in a file? All this I plan to do with just a JCL/DB2 and without writing a cobol program. What do you suggest?

Thanks again for your quick help on this.

Regards,
Dipak

Re: SOUNDEX function in DB2.

Posted: Wed Nov 26, 2014 6:07 pm
by zprogrammer
The catch here is the address field might have multiple words
Can you explain this with example data in table and file?

Re: SOUNDEX function in DB2.

Posted: Wed Nov 26, 2014 6:19 pm
by enrico-sorichetti
pretty ugly approach IMO

in most countries the POSTAL AUTHORITY makes available
the NORMALISED street directory
and most organisations make use of it in order to check and store in the databases the proper addresses

if the application has normalised ( even in a nonstandard way ) the addresses by splitting them in a few fields
then You could apply the SOUNDEX algorithm only to the true name regardless of the number of words

a structured splitting could be
type ( avenue, road , plaza, ... )
name ....
EAST/WEST/NORTH/SOUTH
....

Re: SOUNDEX function in DB2.

Posted: Wed Nov 26, 2014 7:07 pm
by dipakchandak
The catch here is the address field might have multiple words

Can you explain this with example data in table and file?

PB: Table column would have data like below:

123 HANGING GARDEN ROSEWOOD BLVD KT
544 ROSEBERRY STREET XAVIOUR CT
399 CANTT INN PALM STREET DNT

where as the file that comes in daily will have similar data but with littel difference in the spellings like below:

123 HANGIN GRDN ROSEWUD BLVD KT (this sounds matching)
544 ROSECHERRY STREET XAVIOUR CT (this should not be matching as it completely changed the area sound)
399 CANTT INN PALM ST DNT (no idea if STREET and ST will be same phonics)

(everything is brackets above is my assumptions)...My taks is to compare these address and find out if there are differences. I hope it helps.

Re: SOUNDEX function in DB2.

Posted: Wed Nov 26, 2014 7:11 pm
by DB2 Guy
This
dipakchandak wrote:1. Extract the column from DB2.
How do you extract - did you mean that you unload it? If yes, it all becomes very simple. But this statement of your
All this I plan to do with just a JCL/DB2 and without writing a cobol program. What do you suggest?
changes all the game. I don't see a better way than writing a small COBOL program, as you've also mentioned.

Re: SOUNDEX function in DB2.

Posted: Thu Nov 27, 2014 8:57 am
by Chandan Yadav
dipakchandak wrote: All this I plan to do with just a JCL/DB2 and without writing a cobol program. What do you suggest?
I am just wondering how you will do this only with JCL/DB2 and no cobol. You will need to write a program to read file and all other processing?

Also for getting different fieldsof the address field you may use the String function but yes again question still remaining open how you will delimit because for few fields Space might be the valid part

Correct me if I am missing anything here

Reagrds,
Chandan

Re: SOUNDEX function in DB2.

Posted: Thu Nov 27, 2014 8:14 pm
by Anuj Dhawan
Chandan has a valid point - do you intend to load data from file to a DB2 table? If not, how do you "read" it to compare?

Re: SOUNDEX function in DB2.

Posted: Thu Nov 27, 2014 10:37 pm
by zprogrammer
Hi,

Also it is worthy to check the consistency of file with the table and also try to understand how the input file is created. That might help us to look at things from a different perspective

Re: SOUNDEX function in DB2.

Posted: Tue Dec 02, 2014 7:21 pm
by dipakchandak
Hi, Thanks for your inputs. Do you know how to write FUNCTIONS in DB2 and deploy it in DB2? I need to write a function and implement it in prod. once that is implemented, My ab-Initio team will read the columns extract along with this Function and work on the report creation.

My job: To write a DB2 function which will accept a string input and give the key generated out of SOUNDEX for each word from the string?

Thanks,
Dipak

Re: SOUNDEX function in DB2.

Posted: Tue Dec 02, 2014 7:25 pm
by zprogrammer
Hi Dipak,

It is always better to create a new topic for new queries. Admin or Mr green will create a new thread for this question

Re: SOUNDEX function in DB2.

Posted: Tue Dec 02, 2014 7:43 pm
by DB2 Guy
I think the latest follow up is in connection with the previous question, in that case, do you think that we should still separate this out from the original thread?

Re: SOUNDEX function in DB2.

Posted: Wed Dec 03, 2014 2:50 pm
by Anuj Dhawan
dipakchandak wrote:Do you know how to write FUNCTIONS in DB2 and deploy it in DB2? I need to write a function and implement it in prod. once that is implemented, My ab-Initio team will read the columns extract along with this Function and work on the report creation.

My job: To write a DB2 function which will accept a string input and give the key generated out of SOUNDEX for each word from the string?
I'm not sure if I've understood the bold text above clearly enough (from the discussions so far) to help you better. A programming approach sounds much better from first time write and maintenance per se.

However, from the first line above, you seem to talk about UDFs (User defined Functions) in DB2, yes?

If yes - what version of DB2 you are using at your shop? There had been changes the way UDFs can be used in recent releases of DB2 and that should be considered.

Also, if you've already looked at - please see the chapter 5 - "User-defined functions (UDF)" in the red-book "DB2 for z/OS Application Programming Topics" here: http://www.redbooks.ibm.com/redbooks/pdfs/sg246300.pdf

These pdfs also have some good references on UDFs:

http://www.trivadis.com/uploads/tx_caba ... 5EN_01.pdf
http://www.tridug.org/wp-content/upload ... -final.pdf

Hope this helps.