Saturday, February 25, 2012

Scrambled data in some fields

We have recently discovered that several sending systems (all using the same subroutine) have been scrambling some of the data sent to an MS SQL database.

I'm trying to figure out how to identify the scrambled records.

We know that records with zero in the 4th, 6th and 7th positions of their SSNs, had the SSN rearranged. Instead of the SSN in order (123456789) it is now (467512389).

The SSN is used as a subsidiary ID field in the system (let's call it the SID field.) Since some people don't have SSNs, an alternate number, which may begin with three zeros, is used.

I can select off all SIDs beginning with 3 zeros, but some of those will be legitimate records.

I can also select off all SIDs with zeroes in positions 4, 7 and 8, but only SOME systems were making the mistaking.

I can't figure out how to make a list of SIDs with 4, 7, and 8 for which a corresponding SID with 3 zeroes in front exists. This list would be the first we examine. (We will have to check all '000' SIDs, since it's possible that only 'bad' SIDs were created for some people, instead of both 'good' and 'bad' SIDs.)

Can someone help a novice SQL programmer with this (to me) perplexing problem?If I understand you, you need to try the following

INSERT INTO PossibleErrors
( SID )
SELECT SID
FROM Table
WHERE SUBSTRING(convert(varchar,SID),1,3) = '000'
AND SUBSTRING(convert(varchar,SID),4,1) = '0'
AND SUBSTRING(convert(varchar,SID),7,1) = '0'
AND SUBSTRING(convert(varchar,SID),8,1) = '0'

No comments:

Post a Comment