Friday, March 30, 2012

Replacing Multiple Strings Using the REPLACE Function

I'm would like to replace all occurrences of "99999" and "-99999" with "" in a column using SSIS. I can use the REPLACE function in a Derived Column to replace one of those strings, for example: REPLACE(mycolumn,"99999",""). Or to replace both I could use REPLACE(REPLACE(mycolumn,"-99999",""),"99999",""). This seems kind of cumbersome and would get very complicated if I were replacing more strings with "". I'm guessing there is a better way. Can anyone help me out?

Thanks,
Ridium

Ridium wrote:

I'm would like to replace all occurrences of "99999" and "-99999" with "" in a column using SSIS. I can use the REPLACE function in a Derived Column to replace one of those strings, for example: REPLACE(mycolumn,"99999",""). Or to replace both I could use REPLACE(REPLACE(mycolumn,"-99999",""),"99999",""). This seems kind of cumbersome and would get very complicated if I were replacing more strings with "". I'm guessing there is a better way. Can anyone help me out?

Thanks,
Ridium

There isn't a simpler way, that is exactly how you should do it. Its simple and it works and I don't think its cumbersome at all. Just my opinion.

What syntax do you envisage for a REPLACE function that allows you to replace multiple strings? Also, in your example given above I can envisage it replacing the "99999" part of "-99999" and you being left with "-" which isn't what you want.

-Jamie

|||I have about 20 non-printable characters I want to scrub from my data. I guess I will need to string 20 REPLACE functions together unless someone has a better idea.

Thanks for you help,
Ridium
|||

Ridium wrote:

I have about 20 non-printable characters I want to scrub from my data. I guess I will need to string 20 REPLACE functions together unless someone has a better idea.

Thanks for you help,
Ridium

Yeah, I think that's what you'll have to do. Why is that such a problem? I honestly can't fathom how this could be less (in your words) "cumbersome". I'm interested in any ideas you may have.

Regards

Jamie

|||

Jamie Thomson wrote:

Ridium wrote:

I have about 20 non-printable characters I want to scrub from my data. I guess I will need to string 20 REPLACE functions together unless someone has a better idea.

Thanks for you help,
Ridium

Yeah, I think that's what you'll have to do. Why is that such a problem? I honestly can't fathom how this could be less (in your words) "cumbersome". I'm interested in any ideas you may have.

Regards

Jamie

I discovered a more convenient method. Instead of using all those REPLACE functions, just use an expression:
mycolumn == "99999" || mycolumn == "-99999" ? NULL(DT_DECIMAL,2) : (DT_CY)mycolumn

I can just add an additional "or" operation for each new term I want to search for. This is also less prone to errors.

Ridium
|||

Ridium wrote:

Jamie Thomson wrote:

Ridium wrote:

I have about 20 non-printable characters I want to scrub from my data. I guess I will need to string 20 REPLACE functions together unless someone has a better idea.

Thanks for you help,
Ridium

Yeah, I think that's what you'll have to do. Why is that such a problem? I honestly can't fathom how this could be less (in your words) "cumbersome". I'm interested in any ideas you may have.

Regards

Jamie

I discovered a more convenient method. Instead of using all those REPLACE functions, just use an expression:
mycolumn == "99999" || mycolumn == "-99999" ? NULL(DT_DECIMAL,2) : (DT_CY)mycolumn

I can just add an additional "or" operation for each new term I want to search for. This is also less prone to errors.

Ridium

OK, glad you found something that you're happy with. A word of warning though, use parentheses around the first argument to the conditional operator or else you could find yourself in a world of hurt.

Why do you think that is less prone to errors? And when you say "just use an expression", why would using the REPLACE function not constitute using an expression?

Regards

-Jamie

|||I meant use a conditional expression. It seems pretty obvious that something like this:

REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(A,B,C),D,E),F,G),H,I),J,K)

Is a lot more complicated than my solution. Imagine trying to include 20 or 30 functions. This is much harder to read and understand. You could easily lose sight of what parameter goes to what REPLACE function which might cause an error.

Ridium
|||

Ridium wrote:

I meant use a conditional expression. It seems pretty obvious that something like this:

REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(A,B,C),D,E),F,G),H,I),J,K)

Is a lot more complicated than my solution. Imagine trying to include 20 or 30 functions. This is much harder to read and understand. You could easily lose sight of what parameter goes to what REPLACE function which might cause an error.

Ridium

OK fair enough, can't argue with that. Note what I said about parentheses though!

-Jamie

No comments:

Post a Comment