Saturday, February 25, 2012

Scrambling Data in a table

We have a bunch of data that includes things like DNS names and host names. For testing purposes, we want to give people access to our data but we do not want them to see the real names of the items I mentioned prior. What I would like to do is a letter replacement for each character in the name, thus A could become T, B could become U, etc. It's not a big deal if the end user can unscrable these, they probably won't anyways and it's not a big deal if they do, our client just wants something to make it not so obvious.

Is this possible to do through a SQL query and if so, can you give my pointers on how to accomplish this?

Otherwise I will probably use a VB solution, but it would be similar if I could just write a SQL query that I could send to anyone that needs to do this.

you can use SQL to replace things in select statements

read this :http://www.java2s.com/Code/SQL/Function/CallREPLACEfunctioninselectclause.htm

hth,
mcm

|||

U can use Replace function

Thank u

Baba

Please remember to click "Mark as Answer" on this post if it helped you.

|||

You can create an SP which applies some encoding logic over the supplied value and one SP which does exactly the reverse of it. The encoding and decoding logic can absolutely be your ( which ever way you like it to be ). Generally you can play with the ascii value of the character. One caution that should be exercised is that your encode and decode logic must be strong and tested enough for you can't bare wrong values when encoding or decoding.

Hope this will help.

|||

I see how I can use replace to replace one letter, but how would I go about replacing all the letters? Some of my tables have thousands of records, if I have to run a select and then update query for every letter, it's going to take a LONG time.

|||

Hi,

but how would I go about replacing all the letters? Some of my tables have thousands of records, if I have to run a select and then update query for every letter, it's going to take a LONG time.

It seems that you have misunderstood the community members' suggestions. What you want to do is just to hide the real information since you don't want the information to be seen by the users. So we replacing all the records in database would not be a good practice, and actually it's not necessary. As mentioned by others, you have two ways to achieve that.

First, write the "replace" logic in your application level, not in the database level. When you have retrieve a real data from the database, you should replace the specific value against the data before displaying.

Another way is, you can create a stored procedure, after you have select the result set, just use SELECT REPLACE (Ep1,Ep2,Ep3) to replace the char in the result.

As for the speed, the replace operation against a variable would not take much time.

Thanks.

|||

I could do this in the application level, but i've been asked to do it in the database, so that's what I'm trying to do.

And I want to replace the data in the database... I see how the replace function works to replace one character at a time, but when you run your select query, you seem to only have the option to replace one letter at a time. I need to replace 26 letters and 10 numbers, which is what I'm not sure how to do.

SELECT REPLACE(name, "a", "x")

That would replace one character... I have to do it 35 more times for all the different letters and numbers, and then I have to acctually update the database with the new values. And then I have to do it on multiple fields too.

|||

UPDATE YourTable

SET YourField=REPLACE(REPLACE(REPLACE(YourField,"a","x"),"b","y"),"c","z")

Of course, that'll get really nasty with 36 of them, and you will run into problems because if you first replace a with x, when you get to replacing x with whatever, now you are replacing the old x and the new x with something. I would suggest writing a user defined function (UDF) to do the conversion for you. Then you can just:

UPDATE YourTable

SET YourField=dbo.Scramble(YourField)

|||

Kriswd40:

It's not a big deal if the end user can unscrable these, they probably won't anyways and it's not a big deal if they do, our client just wants something to make it not so obvious.

With due respect to your client, that makes no sense at all. Either you hide the information or you don't. Making it "not obvious" is ridiculous. And what's the point of making it available in develpment if they can't read it, anyway?

|||

It's not all fields that I'm changing, just IP Addresses, DNS entries, serial numbers, ect so the data is still very usable. It's not live data, periodically we are given a backup of the data we can use for developement but the client no longer wants us to have access to some of the sensative information in this data. The best thing to do would be to completely encrypt/randomize these fields so they cannot be recovered, but at this time we do nothing so at least this is a step in the right direction.

No comments:

Post a Comment