Thursday, January 23, 2014

Informix SQL Bug?

A statement like

select * from emplyr where emplyr_name MATCHES "*MC[ ]*D*"

should match any emplyr_name that has MCD in the middle of it, as well as MC D, because [ ]* is supposed to be zero or more of the characters inside the brackets.)  At least, that is what the Informix website would indicate.  However: it does not.  It only returns emplyr_name rows with MC D in them -- apparently [ ]* turns into one or more matches.  Now, if I do this:

select * from emplyr where emplyr_name MATCHES "*MC[ A-z0-9']*D*"

That works pretty much as expected.  My guess is that brackets require more than one character inside of them for zero or more characters to match.  It would be nice if I could figure out a way to enter an escape sequence that says "everything that isn't a blank" instead of A-z0-9'

I tried:

select * from emplyr where emplyr_name MATCHES "*MC *D*"

But that returns only MC D and not MCD rows.

It is possible to do a UNION like this:

select * from emplyr where emplyr_name MATCHES "*MCD*"
UNION
select * from emplyr where emplyr_name MATCHES "*MC D*"

but this complicates the matter, because the select statement is inside an SPL, and the SPL would have to decide whether to do a single select or a union of two selects depending on whether the search string contains blanks.  That's ugly.

UPDATE: Ugly solution, but I did it like this:

-- This is ugly.  We need to do a UNION of SELECT statements in the event
-- that there are any blanks in the search strings to create wild carding for
-- zero or more blanks.  To do this, we find out if any of the search strings
-- contain blanks, and if so, make a version with the blanks removed.  If
-- the blanks removed version is different from the original, do a UNION of
-- SELECTS with both versions of the search string.  However: there are
-- several different strings that could contain blanks.  Rather than create
-- multiple FOREACH loops, it is simpler to UNION these together, regardless
-- of whether the blank replacement did anything.
LET l_noblank_emplyr_name = REPLACE(i_emplyr_name, " ", "");
LET l_noblank_addr = REPLACE(i_addr, " ", "");
LET l_noblank_city = REPLACE(i_city, " ", "");
        FOREACH
            SELECT
emplyr_id,
st,
emplyr_name,
addr_long,
city,
zip_cd,
area_cd,
ph_num[1,3],
ph_num[4,7],
updt_usr_id,
updt_dt
INTO
o_emplyr_id,
o_st,
o_emplyr_name,
o_addr_long,
o_city,
o_zip_cd,
o_area_cd,
o_ph_exchg,
o_ph_num,
o_updt_usr_id,
o_updt_dt
FROM emplyr
WHERE emplyr_name MATCHES i_emplyr_name
 AND (addr_long MATCHES i_addr OR addr_long IS NULL)
 AND (city MATCHES i_city OR city IS NULL)
                 AND (st MATCHES i_st OR st IS NULL)
   AND (area_cd MATCHES i_areacode OR area_cd IS NULL)
                          AND (ph_num MATCHES i_phonenumber OR ph_num IS NULL)
UNION
            SELECT
emplyr_id,
st,
emplyr_name,
addr_long,
city,
zip_cd,
area_cd,
ph_num[1,3],
ph_num[4,7],
updt_usr_id,
updt_dt
FROM emplyr
WHERE emplyr_name MATCHES l_noblank_emplyr_name
 AND (addr_long MATCHES l_noblank_addr OR addr_long IS NULL)
 AND (city MATCHES l_noblank_city OR city IS NULL)
                 AND (st MATCHES i_st OR st IS NULL)
   AND (area_cd MATCHES i_areacode OR area_cd IS NULL)
                          AND (ph_num MATCHES i_phonenumber OR ph_num IS NULL)
ORDER BY emplyr_name

Note that inside an SPL, you can UNION two selects, but only the first SELECT can have an INTO clause, and only the last SELECT can have an ORDER BY clause.  Ugly squared.

1 comment:

Hal Duston said...

Re: your final sentence:

The INTO and ORDER BY clauses apply to the entire statement SELECT ... UNION SELECT ...

The INTO clause has to go somewhere, and the choice of first SELECT vs. second SELECT (third, etc) is arbitrary, yes, but it makes a sort of sense.

As for the ORDER BY clause, think of it like this:

(
SELECT ...
UNION
SELECT ...
) ORDER BY ...

N.B. I have only 11 months of Informix experience, but nearly 20 years of Oracle experience.