Simple regex help

Can some regex guru tell me the magic incantation to match this situation -

  • find the word happy

  • ignore unhappy

  • match happy preceded or followed by anything else

examples which should match -

###happy###
– happy –
happy

I’m trying to search through some files in Linux using grep, but I can’t figure out how to construct the regex to do that.

Regards,
Dr. A>

The 1st condition looks like it’s already covered by the 3rd, so basically you want:

pseudo regex (I don’t know regex…)
“happy” minus “unhappy”

I’m confused. “happy” minus “unhappy” doesn’t look like a regular expression (regex).

Some examples of a regular expression would be -

[01234] which would match a single digit from 0 to 4
[0-4] the same as above
\bword\b matches word only as a whole word, i.e. it can’t be in the middle of other letters or digits

That’s about the sum total of my regex knowledge, hence my plea for help.

Dr. A>
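For anyone following along, the three example patterns above can be tried out in a few lines of Python. This is only a sketch: the `matches` helper and the sample strings are made up for illustration, and Python's `re` module is used just because it is easy to run. grep's own syntax differs in places.

```python
import re

# [01234] and [0-4] are two spellings of the same character class.
digits_listed = re.compile(r"[01234]")
digits_range = re.compile(r"[0-4]")

# \b is a zero-width word boundary, so \bword\b only matches "word"
# when it is not glued to other letters, digits or underscores.
whole_word = re.compile(r"\bword\b")


def matches(pattern, text):
    """Return True if the pattern is found anywhere in text (hypothetical helper)."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for sample in ["3", "7", "a word here", "swordfish", "###word###"]:
        print(
            repr(sample),
            matches(digits_listed, sample),
            matches(digits_range, sample),
            matches(whole_word, sample),
        )
```

Note that punctuation such as `#` does not count as part of a word, so `\bword\b` still matches inside `###word###`.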

There is no “not” operator in standard regex… I would suggest chaining two greps, the last one with “-v” for inverted matching:


echo -e "I seem unhappy\nbut am really happy" | grep 'happy' | grep -v 'unhappy'

results in


but am really happy
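To make the behaviour of the chain a bit more concrete, here is a rough Python sketch of the same two-stage filter, next to a single whole-word pattern for comparison. The function names and the sample lines are invented for illustration. The whole-word variant assumes that "happy" should only count when it stands alone as a word, which may or may not be what the original question wants (it would also skip something like "happyness").

```python
import re


def chained_filter(lines):
    """Mimic `grep 'happy' | grep -v 'unhappy'` on a list of lines."""
    with_happy = [line for line in lines if "happy" in line]
    # The second stage drops the whole line if "unhappy" appears anywhere in it.
    return [line for line in with_happy if "unhappy" not in line]


# Assumption: "happy" must be a whole word, so \b marks both edges.
WHOLE_WORD_HAPPY = re.compile(r"\bhappy\b")


def whole_word_filter(lines):
    """Keep lines that contain "happy" as a standalone word."""
    return [line for line in lines if WHOLE_WORD_HAPPY.search(line)]


SAMPLE = [
    "I seem unhappy",
    "but am really happy",
    "###happy###",
    "unhappy, then happy again",
]

if __name__ == "__main__":
    print(chained_filter(SAMPLE))
    print(whole_word_filter(SAMPLE))
```

The difference shows up on the last sample line: the chain throws it away because "unhappy" is on the same line, while the whole-word pattern keeps it. With GNU grep, `grep -w happy` should behave much like the second function.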

If you are looking for something more JFlex-like, it would be something like


TK = "happy"
%%
// first so the lexer looks for this before looking for ~happy~
"unhappy" {
    ;// nothing here so we ignore it
}

.*{TK}.* {
    // your lexer code goes here, e.g. System.out.println(yytext());
}

90% inaccurate, since my flex skills are disappearing like sand in the wind.

Hi

If you’re using grep like you say you are, then cylab wins the cookie.

grep happy <file list> | grep -v unhappy

Endolf

\b([^ ]*([^u]n|u[^n]|[^un][^un]))?happy[^ ]*\b

\b A word boundary
[^abc] Any character except a, b, or c (negation)
X* X, zero or more times
X|Y Either X or Y
(X) X, as a capturing group
X? X, once or not at all

http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html
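Engines that support lookaround (java.util.regex, documented at the link above, and Python's `re` among them) can say "happy, but not preceded by un" more directly with a negative lookbehind. This is a different technique from the character-class pattern above, and plain POSIX basic or extended regex syntax does not have it. A minimal sketch in Python, with a made-up `find_happy` helper:

```python
import re

# (?<!un) is a negative lookbehind: it succeeds only at positions where
# the two characters immediately before "happy" are not "un".
NOT_UNHAPPY = re.compile(r"(?<!un)happy")


def find_happy(text):
    """Return the start offsets of every "happy" that is not part of "unhappy"."""
    return [m.start() for m in NOT_UNHAPPY.finditer(text)]


if __name__ == "__main__":
    for sample in ["happy", "###happy###", "unhappy", "unhappy but happy"]:
        print(repr(sample), find_happy(sample))
```

Unlike a whole-word test, this still matches "happy" inside longer words such as "happyness"; it only rules out the "un" prefix.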

With such expressions it’s very tempting to just String.indexOf your way out :-X

Regex may be very powerful, but designing, debugging and maintaining any non-trivial pattern is not worth it IMO (unless it’s an exercise).

It’s with non-trivial patterns that regexps are useful; for things like the “happy” case it’s like using a cannon to kill a flea.

Regexps are fantastic. One of the most useful tools we have, but pretty much a programming language in their own right.

This isn’t a Java project; it’s so I can search through files on the command line and figure out which ones I need to examine.

I think cylab has the solution for me. I can remember it for other cases!

Much thanks to all for their input.

Dr. A>

PS - While I understand how powerful regex-s can be, I think they are evil, just like Perl!! </troll bait>

Ah, young padawan, once I was like you. They are not evil, though they are subtle and quick to anger. The real problem with them is that they are powerful and if you are weak and tempted by the dark side you could easily be consumed by their greedy evaluation. If you are afraid of them, you aren’t ready to face them yet. Don’t worry about it, they will still be there when you are ready.

Perl though, that is evil…

Truly, sir, that was an admirable reply. You are indeed worthy of the epithet “ninja”.

Whether you will be chosen by the Java Core though, that is not for mortals to decide…

(Yes, I’m still curious)

Someone needs to devise a verbose regex syntax.

The logic behind construction of an expression is fundamentally less complex than programming, only the obtuse syntax scares people off (and rightly so IMO).
Give me an Eclipse GUI plugin for building & maintaining them, and I might use them for more than trivial string splitting.
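Some engines already go part of the way there. Python's `re.VERBOSE` flag (Java has a similar `Pattern.COMMENTS` flag) lets a pattern be spread over several lines with comments, and whitespace in the pattern is ignored. A small sketch, using a lookbehind-based "happy but not unhappy" pattern purely as an illustrative example; the constant names are made up:

```python
import re

# Same pattern written twice: once compact, once in verbose mode.
COMPACT = re.compile(r"(?<!un)happy")

HAPPY_VERBOSE = re.compile(
    r"""
    (?<!un)   # not preceded by "un", so "unhappy" is skipped
    happy     # the literal word we are after
    """,
    re.VERBOSE,
)

if __name__ == "__main__":
    for sample in ["happy", "###happy###", "unhappy"]:
        print(
            repr(sample),
            bool(COMPACT.search(sample)),
            bool(HAPPY_VERBOSE.search(sample)),
        )
```

It is still the same regex underneath, so this helps with readability rather than with the logic itself.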

Breakfast you are truly a Phoentic Warrior!

I label regex-s as evil, much the same way I label lex and yacc. Every time I try to learn about them, the description is just as confusing as the syntax. I would learn far better with a bunch of examples and their results. If I could just find that mixed in with the traditional explanations, I’d probably avoid the hate completely.

I recently used the Spirit library from Boost and found it to be excellent. While not everything was intuitive to me, creating parsers and grammar rules just made sense. Some day my Jedi powers will be strong enough to bring balance to my programming. Until then, I’ll keep using my Goto statements. :)

Cheers,
Dr. A>