word boundaries for regular expressions

#1
08-26-2008, 03:33 AM
Adam Akhtar

Hi, I did a search for word boundaries but didn't quite find what I was
looking for.

If I have strings containing products and model numbers,

e.g.
"JP-ATH Headphones JP"

and I want to remove the last JP but not the one in the model number, how
do I go about it?

I tried

string.gsub(/\bJP\b/, '')
but it removes both.
I guess the hyphen in the model number doesn't count as a word character, so
that JP gets removed too.

Am I doing something wrong here?
--
Posted via http://www.ruby-forum.com/.
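
A small sketch of why both occurrences disappear, using the example string from the post. The method name `strip_jp_naive` is made up for illustration. `\b` matches at any position between a word character (`\w`: a letter, digit, or underscore) and a non-word character. The hyphen is a non-word character, so the JP in front of it is delimited by word boundaries just like the JP at the end.

```ruby
# Hypothetical helper wrapping the attempt from the post.
def strip_jp_naive(string)
  string.gsub(/\bJP\b/, '')
end

example = "JP-ATH Headphones JP"

# Both JPs are delimited by word boundaries, so scan finds two matches.
p example.scan(/\bJP\b/).length   # 2
p strip_jp_naive(example)         # "-ATH Headphones "

# The hyphen is not a word character, which is why \b matches next to it.
p("-" =~ /\w/)                    # nil
```

Note that the space before the trailing JP is left behind, because `\b` is zero-width and consumes nothing.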

#2
08-26-2008, 03:38 AM
Stefano Crocco

On Tuesday 26 August 2008, Adam Akhtar wrote:
> Hi, I did a search for word boundaries but didn't quite find what I was
> looking for.
>
> If I have strings containing products and model numbers,
>
> e.g.
> "JP-ATH Headphones JP"
>
> and I want to remove the last JP but not the one in the model number, how
> do I go about it?
>
> I tried
>
> string.gsub(/\bJP\b/, '')
> but it removes both.
> I guess the hyphen in the model number doesn't count as a word character, so
> that JP gets removed too.
>
> Am I doing something wrong here?


You can replace the first \b with \s, which matches only whitespace:

string.gsub(/\sJP\b/, '')
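
A quick sketch of how this variant behaves on a few inputs. The method name `strip_jp_after_space` and the second and third strings are made up for illustration. Unlike `\b`, `\s` consumes a character, so the whitespace in front of JP is removed along with it.

```ruby
# Hypothetical helper wrapping the \s variant suggested above.
def strip_jp_after_space(string)
  string.gsub(/\sJP\b/, '')
end

p strip_jp_after_space("JP-ATH Headphones JP")  # "JP-ATH Headphones"

# A JP at the very start has no whitespace before it, so it is left alone.
p strip_jp_after_space("JP Headphones")         # "JP Headphones"

# A model number that follows a space is still affected.
p strip_jp_after_space("Headphones JP-ATH")     # "Headphones-ATH"
```

Whether the last two cases matter depends on where model numbers and a standalone JP can appear in the real data.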

I hope this helps

Stefano




#3
08-26-2008, 05:32 AM
Michael Morin

Adam Akhtar wrote:
> Hi, I did a search for word boundaries but didn't quite find what I was
> looking for.
>
> If I have strings containing products and model numbers,
>
> e.g.
> "JP-ATH Headphones JP"
>
> and I want to remove the last JP but not the one in the model number, how
> do I go about it?
>
> I tried
>
> string.gsub(/\bJP\b/, '')
> but it removes both.
> I guess the hyphen in the model number doesn't count as a word character, so
> that JP gets removed too.
>
> Am I doing something wrong here?


If the spacing is consistent, you can do something like this:

"JP-ATH Headphones JP".split(/\s+/)[0..-2].join(" ")

or this if it's not:

"JP-ATH Headphones JP".sub(/\s+\w+$/,'')

The advantage of the first one is that you can adapt it to remove something
from the middle of the string if necessary. The second one is probably faster
and generally makes more sense.
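
A sketch of both one-liners wrapped in methods, plus one possible reading of the "remove something out of the middle" remark. All three method names and the extra test strings are made up for illustration, and `drop_token` is an assumption about what such an adaptation could look like, not something posted in the thread.

```ruby
# Hypothetical helpers; the names are not from the thread.

# Drop the last whitespace-separated word, whatever it is.
def drop_last_word(string)
  string.split(/\s+/)[0..-2].join(" ")
end

# Same idea with a single substitution anchored at the end of the line.
def chop_last_word(string)
  string.sub(/\s+\w+$/, '')
end

# Variation on the split approach: drop a token wherever it appears.
def drop_token(string, token)
  string.split(/\s+/).reject { |word| word == token }.join(" ")
end

p drop_last_word("JP-ATH Headphones JP")      # "JP-ATH Headphones"
p chop_last_word("JP-ATH Headphones JP")      # "JP-ATH Headphones"

# Neither of the first two checks that the last word is actually JP:
p drop_last_word("JP-ATH Headphones")         # "JP-ATH"

# Comparing whole tokens leaves "JP-ATH" alone and removes only "JP".
p drop_token("JP-ATH JP Headphones", "JP")    # "JP-ATH Headphones"
```

Also note that in Ruby `$` matches at the end of a line, not only at the end of the string; `\z` is the end-of-string anchor if the input can contain newlines.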

--
Michael Morin
Guide to Ruby
http://ruby.about.com/

#4
08-26-2008, 12:22 PM
Robert Klemme

2008/8/26 Adam Akhtar <adamtemporary@gmail.com>:
> Hi, I did a search for word boundaries but didn't quite find what I was
> looking for.
>
> If I have strings containing products and model numbers,
>
> e.g.
> "JP-ATH Headphones JP"
>
> and I want to remove the last JP but not the one in the model number, how
> do I go about it?
>
> I tried
>
> string.gsub(/\bJP\b/, '')
> but it removes both.
> I guess the hyphen in the model number doesn't count as a word character, so
> that JP gets removed too.


Yep, that sums it up pretty well.

> Am I doing something wrong here?


Obviously, since your results do not match your expectations / requirements. :-)

You could use a lookahead:

irb(main):002:0> "JP-ATH Headphones JP".gsub /\bJP\b(?=\s|$)/, 'XXX'
=> "JP-ATH Headphones XXX"

It all depends on what other occurrences you have and which of them
you want to match.
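
A sketch of what the lookahead is doing, with the pattern from the irb session wrapped in a method. The name `mark_trailing_jp` and the second and third strings are made up for illustration. `(?=\s|$)` asserts that JP is followed by whitespace or the end of the line without consuming anything, so the JP in "JP-ATH" is rejected because a hyphen follows it.

```ruby
# Hypothetical helper wrapping the lookahead pattern from the irb session above.
def mark_trailing_jp(string)
  string.gsub(/\bJP\b(?=\s|$)/, 'XXX')
end

p mark_trailing_jp("JP-ATH Headphones JP")   # "JP-ATH Headphones XXX"

# The lookahead does not consume the space, so it survives the substitution.
p mark_trailing_jp("Headphones JP Black")    # "Headphones XXX Black"

# Only the right-hand side is checked: a JP that ends a model number still matches.
p mark_trailing_jp("ATH-JP Headphones")      # "ATH-XXX Headphones"
```

The last case is an instance of the "it all depends" caveat: if model numbers can end in JP, the left-hand side needs a condition too.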

Kind regards

robert

--
use.inject do |as, often| as.you_can - without end

#5
08-28-2008, 01:07 AM
Adam Akhtar

Thanks everyone. Yes, the strings to match vary a lot in terms of
positioning. Some don't have model numbers, some do; some don't have JP, some
do. I've known of lookahead but never used it before. I'll give that a
shot.

Thanks!

adam
--
Posted via http://www.ruby-forum.com/.
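
For strings that vary as described, one possible sketch is to require whitespace (or the string edge) on both sides of JP. This goes beyond what was posted in the thread: it assumes Ruby 1.9 or later, where lookbehind is available, and the method name `remove_standalone_jp` and the product strings are made up for illustration.

```ruby
# Hypothetical helper; assumes Ruby 1.9 or later, where lookbehind is available.
def remove_standalone_jp(string)
  # (?<!\S) : not preceded by a non-whitespace character
  # (?!\S)  : not followed by a non-whitespace character
  # squeeze and strip tidy up the spaces left behind by the removal.
  string.gsub(/(?<!\S)JP(?!\S)/, '').squeeze(' ').strip
end

# Made-up product strings covering the cases described above.
[
  "JP-ATH Headphones JP",
  "ATH-JP Headphones",
  "Headphones JP",
  "JP Headphones",
  "Headphones"
].each do |name|
  puts "#{name.inspect} -> #{remove_standalone_jp(name).inspect}"
end
```

Be aware that `squeeze(' ')` collapses every run of spaces in the string, not just the one left by the removed JP, which may or may not be wanted.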
