On a recent project, I needed to build a regular expression to read what are known as single line comments. You know the type in C-based languages:
// this is a basic comment
The source for my original expression was slightly misleading (it was the one expression I did not write myself, and in my haste I did not read it carefully). In hindsight, I realize it was never going to be enough.
In my system, my regular expression always needed to match from the beginning. This is what I came up with:
^(//)[^\n\r]*[\n\r]
My explanation:
- always starting with //
- zero or more non-newline characters, [^\n\r]*
- some ending termination in the form of a newline character, [\n\r]
This works quite well for finding simple single line comments in code, but it may eventually interfere with reading the opening marker of a block comment, so it might need revisions.
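For anyone who wants to try the expression, here is a minimal sketch using the standard <regex> header. It is not the code from my project: the function name leading_line_comment is made up for illustration, and I am assuming the default ECMAScript grammar plus the match_continuous flag to mirror the "always match from the beginning" requirement.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original project): returns the single
// line comment found at the very start of `text`, including its terminating
// newline character, or an empty string if `text` does not start with one.
std::string leading_line_comment(const std::string& text) {
    // Raw string literal, so the backslashes reach the regex engine untouched.
    static const std::regex pattern(R"(^(//)[^\n\r]*[\n\r])");
    std::smatch match;
    // match_continuous only accepts a match that begins at the first character.
    if (std::regex_search(text, match, pattern,
                          std::regex_constants::match_continuous)) {
        return match.str(0);
    }
    return "";
}
```

Note that because the expression insists on a trailing [\n\r], a comment on the very last line of the input, with no newline after it, is not matched. For a Windows style \r\n ending, only the \r is consumed.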
I’m not a regular expression master; I leave that to Sam.
Curious… Maybe I’m showing my ignorance here, but why can’t you just use something as trivial as
^//.*$
if all you’re trying to do is match single line comments?
You might be right. I was working with the regex tools in C++, and with the methods we were given, I don’t think $ behaved the way it does elsewhere.