Monday, July 12, 2010

Beware of JavaScript String.replace

Here is a problem I encountered a few days ago:

What would you expect the result of the following expression to be:

'abc'.replace('b', 'someUserInput')

Well, it's easy to guess that it will be the string 'asomeUserInputc', and indeed it will.

However, what would happen if instead of 'someUserInput' we really had some user input?

You would expect it to return the actual user input with 'a' prefixing it and 'c' as its suffix.

I also guess some of you knew that if you used a regular expression instead of 'b', then some tokens in the replacement string would have a special meaning, as in the following table:

Characters   Replacement text
$$           $
$&           The matched substring.
$`           The portion of the string that precedes the matched substring.
$'           The portion of the string that follows the matched substring.
$n           The nth capture, where n is a single digit in the range 1 to 9 and $n is not followed by a decimal digit. If n≤m and the nth capture is undefined, use the empty String instead. If n>m, the result is implementation-defined.
$nn          The nnth capture, where nn is a two-digit decimal number in the range 01 to 99. If nn≤m and the nnth capture is undefined, use the empty String instead. If nn>m, the result is implementation-defined.

(Here m is the number of capturing groups in the search pattern.)
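
To make the table concrete, here is a small sketch of these tokens with a regular expression as the search value. The sample strings and variable names are made up for illustration.

```javascript
// Illustrative input; every call below uses a regular expression as the search value.
var s = 'price: 100 USD';

var wrapped = s.replace(/\d+/, '[$&]');          // $& is the matched text
var swapped = s.replace(/(\d+) (\w+)/, '$2 $1'); // $1 and $2 are the captures
var dollar  = s.replace(/\d+/, '$$$&');          // $$ is a literal dollar sign
var before  = 'abc'.replace(/b/, '[$`]');        // $` is the text before the match
var after   = 'abc'.replace(/b/, "[$']");        // $' is the text after the match

console.log(wrapped); // price: [100] USD
console.log(swapped); // price: USD 100
console.log(dollar);  // price: $100 USD
console.log(before);  // a[a]c
console.log(after);   // a[c]c
```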

What I did not know or expect is that this behavior also occurs when a plain string is used as the search value.

In my case I had to do some length-based truncation of a user input string, append an ellipsis to it, and inject it into a template string in place of a placeholder.

Template string: '<div>$userInputAfterElipsis</div>'
Original User input: 'some long string that said something and a price like 1M$ isideIt'

"Luckily" for me, it turned out that the truncation logic cut the user's input string right after the $ sign, so when I appended an &hellip; (…) to it I ended up with 'some long string that said something and a price like 1M$&hellip;' as the new value for the string replacement.

'<div>$userInputAfterElipsis</div>'.replace('$userInputAfterElipsis','some long string that said something and a price like 1M$&hellip;')

which returns:
'<div>some long string that said something and a price like 1M$userInputAfterElipsishellip;</div>'
having replaced the $& in the new value with the matched substring, $userInputAfterElipsis.
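
The same effect can be reproduced in a few lines. This is a shortened sketch of the case above; the variable names and the shorter user input are made up for illustration.

```javascript
// The search value is a plain string, yet $& in the replacement is still expanded.
var template = '<div>$userInputAfterElipsis</div>';
var userInput = 'a price like 1M$&hellip;'; // truncated right after the $ sign

var result = template.replace('$userInputAfterElipsis', userInput);
console.log(result);
// <div>a price like 1M$userInputAfterElipsishellip;</div>

// An even smaller case: $& stands for the matched 'b'.
var doubled = 'abc'.replace('b', '$&$&');
console.log(doubled); // abbc
```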

So that was the problem. But how should this be resolved? And why should you care?

I will start with the second question:
You should care because if you use JavaScript you probably also use String.replace, and if you do, it might fail to do what you expect whenever the replacement string is not a fixed string that you control, such as user input.

As for what can/should be done, well, the first thing that comes to mind is that you need to escape those $'s,
so how about replacing '$' with '$$' in the replacement string before using it?
This will not work, for two reasons. The first is that such a replacement only replaces the first occurrence of a $ sign in the replacement string. The second is that here, too, the $$ is interpreted as a single $. So what we actually need is to replace every $ (a global replace) and to write the escape itself as '$$$$'.
Now this will work BUT, IMHO, performance-wise it is not such a great solution.
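
A minimal sketch of that escaping idea follows. The helper name escapeReplacement and the $placeholder template are hypothetical, chosen only for this example.

```javascript
// Hypothetical helper: doubles every $ so that String.replace emits it literally.
function escapeReplacement(text) {
  // '$$$$' is itself a replacement string, so it produces two literal $ signs.
  return text.replace(/\$/g, '$$$$');
}

// The naive attempt changes nothing: only the first $ is touched,
// and '$$' collapses back into a single $.
var naive = '1M$&hellip;'.replace('$', '$$');

var escaped = escapeReplacement('1M$&hellip;');
var result = '<div>$placeholder</div>'.replace('$placeholder', escaped);

console.log(naive);   // 1M$&hellip;
console.log(escaped); // 1M$$&hellip;
console.log(result);  // <div>1M$&hellip;</div>
```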

What I ended up doing was using the following:
This is OK as long as you want all the occurrences of original to be replaced with replacement. If you want to replace only the first one, you should use:
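
One way to get the behavior described here, not necessarily the exact code used in this post, is to avoid replacement-string parsing altogether. The sketch below assumes split/join for replacing all occurrences and a replacer function for replacing only the first one. Both function names are hypothetical.

```javascript
// Replaces every occurrence of `original`; join() never interprets $ tokens.
// Note: assumes `original` is a non-empty string.
function replaceAllLiteral(str, original, replacement) {
  return str.split(original).join(replacement);
}

// Replaces only the first occurrence; the return value of a replacer
// function is inserted as-is, without $ token expansion.
function replaceFirstLiteral(str, original, replacement) {
  return str.replace(original, function () {
    return replacement;
  });
}

console.log(replaceAllLiteral('x-x', 'x', '1M$&'));   // 1M$&-1M$&
console.log(replaceFirstLiteral('x-x', 'x', '1M$&')); // 1M$&-x
```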

As always, I hope this will help someone out there in the sea of bits and bytes...
