Saturday, February 25, 2012

scramble ssn with sql server

Hi,
I'm trying to scramble the ssn#s within our database. I would need a 9
digit number to be converted into another 9 digit number in our dev
database.
Example #1:
ssn: 123456789 converts to 987654321
Also there is a catch, there is a possibility that there could be
duplicate ssn within a table due to bad data. I was the 2 records with
the same actual ssn# to be converted into the same scrambled ssn# using
sql server (so that the scrambled ssn#s match) for this issue.
Is there a way to do this?
Thanks
:DI hope you are not thinking about reversing the SSN in order to protect the
data? Any third grader could figure that out and would know all the SSNs.
Your best bet would be to replace them with actual random IDs. You might
try this approach...
Assuming you have SSN stored in one table, lets call it CUSTOMERS...
create a new table SSN_MASK. Insert into this table every unique SSN, along
with an identity column to give them a unique value. Then use the values in
this table to update SSN in the real table. Lastly, drop the SSN_MAsk table
(or at least truncate it) so no one can match the bogus SSNs to the real
thing.
create table SSN_MASK
(
NEW_SSN integer identity primary key not null,
OLD_SSN varchar(9) -- not sure how you defined it
);
insert into SSN_MASK (OLD_SSN) (select unique SSN from CUSTOMERS);
update CUSTOMERS
set SSN = (select
right('000000000' + cast(a.NEW_SSN as varchar(9)),9)
-- make sure the new SSN is 9 characters
from SSN_MASK as a
where a.OLD_SSN = CUSTOMERS.SSN);
drop table SSN_MASK;
<dmalhotr2001@.yahoo.com> wrote in message
news:1158351116.321489.122450@.i3g2000cwc.googlegroups.com...
> Hi,
> I'm trying to scramble the ssn#s within our database. I would need a 9
> digit number to be converted into another 9 digit number in our dev
> database.
> Example #1:
> ssn: 123456789 converts to 987654321
>
> Also there is a catch, there is a possibility that there could be
> duplicate ssn within a table due to bad data. I was the 2 records with
> the same actual ssn# to be converted into the same scrambled ssn# using
> sql server (so that the scrambled ssn#s match) for this issue.
> Is there a way to do this?
> Thanks
> :D
>|||>> create table SSN_MASK
(EW_SSN integer identity primary key not null,
OLD_SSN varchar(9)); -- not sure how you defined it <<
SSN is always CHAR(9). Simply numbering it with a proprietary feature
is not that good; you have destroyed the data type. Here is one we
used in procedural code with arrays.
CREATE TABLE SSN_masks
(shift_id INTEGER NOT NULL PRIMARY KEY -- cols 8 & 9
CHECK (shift_id BETWEEN 00 AND 99),
col1 INTEGER NOT NULL,
col2 INTEGER NOT NULL,
col3 INTEGER NOT NULL,
col4 INTEGER NOT NULL,
col5 INTEGER NOT NULL,
col6 INTEGER NOT NULL,
col7 INTEGER NOT NULL);
You take the last two digits of the SSN (fastest changing values) and
look up a vector that tells you how to shift the remaining seven
digits.
shift(x) = ABS((x + col_n) % 11-10)
Since you have 100 different masks, the data gets scrambled pretty good
and it destroys the area digits which would otherwise repeat and give
information about the population geographic distribution. This is also
reversible because we preserve the shift identifier in the output;
shift all of them and it is a bitch to unscramble without the shift
array. Then you change the array when you get the next sample.

No comments:

Post a Comment