'match regular expression' or 'if' statement?
Moderators: Moderators for English X Forum, Scripting / Modding Moderators
-
- Posts: 269
- Joined: Sat, 20. May 06, 14:55
Hi all, I've got a question about the 'match regular expression' code (General Commands -> Strings -> <RetVar/IF> match regular expression: <Var/String> to string <Var/String>).
It seems to be the same as an (==) if statement, so why did Egosoft implement it then? I think I'm missing something here, so if you know what it is, please say. The only obvious difference I see is that one is designed to compare strings and the other anything that can be compared. The only other thing I can think of is that it performs better than a normal if statement (faster/more efficient). If you know the answer I'd like to hear it, and thanks in advance.
Edit: I think I figured out what it does. You can compare a word to other words/phrases, so it returns true when you compare "car" to "cartoon" but false the other way around. The only thing to watch out for is that it is case sensitive, so "Car" and "car" return false.
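For comparison outside X3, here is a minimal Python sketch of the difference described in the edit above: `==` needs the two strings to be identical, while a pattern search succeeds when the pattern occurs anywhere in the text. Python's `re` module is only a stand-in here; that the X3 command behaves the same way is an assumption, and the helper names are made up for illustration.

```python
import re

def equals(a: str, b: str) -> bool:
    """Plain equality, like an (==) if statement."""
    return a == b

def matches(pattern: str, text: str) -> bool:
    """True when the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

car_in_cartoon = matches("car", "cartoon")   # pattern occurs inside the text
cartoon_in_car = matches("cartoon", "car")   # text is too short to contain the pattern
case_differs = matches("Car", "car")         # matching is case sensitive by default
plain_equal = equals("car", "cartoon")       # equality needs identical strings
```

Which string is the pattern and which is the text matters, which is why the comparison only works in one direction.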
Hrm... interesting.. it could just be one of those "shortcuts" that they put in. Like Skip if/Skipifnot
Or readtext to array from min to max (which could've easily been done via a loop)
Have you tried it with case/special characters?
And as to whether it's more efficient, you would hope that it is, but there is probably no way we'll find out, and such a basic command wouldn't really use much CPU anyway.
- mostlikely
- Posts: 157
- Joined: Sat, 29. Nov 03, 23:20
Regular expression wiki
The short story is that regular expressions allow you to do more than a basic character-for-character matching of strings.
Regular expressions are basically 'patterns' in text you want to search for instead of the exact text.
For example, say you want to find out if a ship's name contains 'Nova'. You can do this by cutting up the ship's name into pieces, but you could also use a regular expression like this: '<any text> Nova <any text>'.
Which would find 'Enhanced Nova' but also 'Nova prototype'.
The actual regular expression would look something like '.* Nova .*' (I don't know the exact syntax for x3 mind)
It's hard to come up with actual concrete examples where you would use it. But when you do find a use for regular expressions it usually saves lots of code and time because you can compare/manipulate strings without having to dissect them character by character.
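As a rough illustration of the 'Nova' example in a standard engine (Python's `re`, not X3), the sketch below filters a list of made-up ship names. The spaces around 'Nova' are dropped in the first pattern, because in standard syntax '.* Nova .*' would demand a literal space on both sides of the word.

```python
import re

# Hypothetical ship names, for illustration only.
ship_names = ["Enhanced Nova", "Nova prototype", "Buster Vanguard"]

# "<any text>Nova<any text>" in standard regex syntax.
loose = re.compile(r".*Nova.*")
# The same idea with literal spaces required on both sides of the word.
strict = re.compile(r".* Nova .*")

nova_ships = [name for name in ship_names if loose.search(name)]
strict_ships = [name for name in ship_names if strict.search(name)]
```

With a search-style match the leading and trailing `.*` are optional; `Nova` on its own would select the same names.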
-
- Posts: 269
- Joined: Sat, 20. May 06, 14:55
yeah I tried special characters and capitals, special characters work fine, but capitals and non capitals return false ("Nova" and "nova" is false, "Nova Raider" and "Nova Vanguard" is true, like mostlikely said ). So for now I don't really see many advantages yet, as upper cases are considered different from their lower case variants, but I see it can be useful for non-user input type comparisons.
Anyway thanks guys
- mostlikely
- Posts: 157
- Joined: Sat, 29. Nov 03, 23:20
(----____JEFF____----) wrote: but capitals and non capitals return false.
Keep in mind that regular expressions are not case insensitive unless you tell them to be. Again, I don't know the exact syntax, but it's also possible to use 'or' operations in regular expressions:
(n|N)ova = (n or N)ova
would match Nova and nova.
I can't check atm but are there perhaps any examples in the standard X3 scripts of this function? Perhaps that would help to figure it out as I'm pretty sure it 'should' be able to do what you want.
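To make the 'or' idea concrete, here is a small Python sketch (standard `re` syntax, not confirmed for X3) of three ways to accept both "Nova" and "nova". The `re.IGNORECASE` flag is a Python feature; whether X3 offers anything equivalent is not established here.

```python
import re

names = ["Nova", "nova", "NOVA"]

alternation = re.compile(r"(n|N)ova")              # (n or N)ova
char_class = re.compile(r"[nN]ova")                # same idea as a character class
ignore_case = re.compile(r"nova", re.IGNORECASE)   # flag-based, any mix of case

by_alternation = [n for n in names if alternation.fullmatch(n)]
by_char_class = [n for n in names if char_class.fullmatch(n)]
by_flag = [n for n in names if ignore_case.fullmatch(n)]
```

The alternation and the character class only cover the first letter, so "NOVA" is accepted by the flag-based version alone.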
-
- Posts: 269
- Joined: Sat, 20. May 06, 14:55
mostlikely, this is the syntax:
Code:
<RetVar/IF> match regular expression: <Var/String> to string <Var/String>
Where <RetVar/IF> can be an if/skip if statement or a variable to put the output in.
The <Var/String>'s can be a string or a variable containing a string. I've no idea if TC has some form of wild card or OR for use in this statement. It does have OR for regular if statements, but that's only for regular if statements afaik (so only like this: if $x == $y OR $z == $y). I also don't know if there are any scripts atm that use this code.
How I tested it so far is by just entering 2 strings and writing the return value to a log.
The thing I want to use this for also works fine with an if statement, I was just curious what it did/how it worked and if I could use it.
-
- Posts: 283
- Joined: Sun, 3. Aug 08, 20:30
mostlikely meant the syntax of the regular expression which would be the first string variable.
In normal usage, a regular expression would look like "^(Falcon|Nova) .* 0[1-9]". This would (say you're searching a database and not within X3) return all the variants of the Falcon and the Nova - Prototype, Raider, Sentinel, Vanguard - which had been serialized from 01 to 09. The period and asterisk (.*) form a wildcard for any run of characters; together with the spaces on either side, it means a name such as "Falcon 03", with nothing between the type and the serial, is not returned. The caret (^) anchors the match to the start of the name, so an entry with any text before the "Falcon" or "Nova" string is ignored.
So "^(Falcon|Nova) .* 0[1-9]" would return
Falcon Vanguard 02
or
Nova Nebula 04
but not
Falcon 03
or
(Blue) Nova Raider 07
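The four names above can be run through a standard engine as a quick check. This is a Python sketch, not X3 script, and it writes the wildcard as `.*`, since a lone `.` matches exactly one character in Python's `re`.

```python
import re

# Anchored at the start, one of two ship types, some text, then a serial 01-09.
pattern = re.compile(r"^(Falcon|Nova) .* 0[1-9]")

names = [
    "Falcon Vanguard 02",
    "Nova Nebula 04",
    "Falcon 03",
    "(Blue) Nova Raider 07",
]

selected = [name for name in names if pattern.search(name)]
rejected = [name for name in names if not pattern.search(name)]
```

"Falcon 03" is rejected because the pattern needs two separate spaces around the wildcard, and "(Blue) Nova Raider 07" because the caret pins the ship type to the start of the name.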
_________________________
You would use something like this within a loop. Say you got an array of player owned ships within your current sector, but you only wanted to control the Novas and any variants that had been assigned a designation by you at the start of the name.
You would use "^\(.\) (.)? Nova (.)?" to narrow the array down with that regular expression.
Now, this would be rather esoteric as you would undoubtedly have assigned something useful between the beginning parentheses like a homebase or wing which you would simply use the respective commands in place of this.
However, if you were making something like a renaming script, this would be perfect for mass renames in using this command as a way to exclude certain objects.
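A hedged sketch of that renaming idea in Python rather than X3 script: names that already carry a parenthesised designation and are Novas are excluded, everything else gets a prefix. The fleet names, the prefix and the function name are all hypothetical, and the pattern is written in standard syntax, which may not carry over to X3 unchanged.

```python
import re

# Hypothetical fleet; "(...)" at the start is a player-assigned designation.
fleet = [
    "(Alpha) Nova Raider",
    "Nova Vanguard",
    "(Alpha) Buster",
    "(Beta) Enhanced Nova",
]

# Designation in parentheses, optional words before and after "Nova".
designated_nova = re.compile(r"^\(.+\) (.+ )?Nova( .+)?$")

def rename_unmarked(names, prefix):
    """Leave designated Novas alone and prefix every other name."""
    return [
        name if designated_nova.search(name) else f"{prefix} {name}"
        for name in names
    ]

renamed = rename_unmarked(fleet, "(New)")
```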
Last edited by WindsOfBoreas on Tue, 13. Jan 09, 20:32, edited 2 times in total.
"Humanity has the stars in its future, and that future is too important to be lost under the burden of juvenile folly and ignorant superstition." - Isaac Asimov
-
- Posts: 269
- Joined: Sat, 20. May 06, 14:55
I'm mostly curious about how much of the theoretical syntax actually works with the script command. =)
My complete script download page. . . . . . I AM THE LAW!
There is no sense crying over every mistake. You just keep on trying till you run out of cake.
-
- Posts: 283
- Joined: Sun, 3. Aug 08, 20:30
Most likely none of the syntax I laid out works in X3. That is all from the HTML regular expression code (though most of it does work in the rest of the RegEx engines).
I'll do some testing of it right after dinner here and get back to you.
WindsOfBoreas wrote: However, if you were making something like a renaming script
It's unlikely to work because I _think_, and I'm by no means certain (I'm better with Perl than "X-Scripting"), that the RegExp matching in X3 is just that: matching.
This means you can't assign flags (such as "i" for case-insensitive matching) or use parentheses to store matches for extraction. So you can't extract the variant type using something like:
Code:
$sVariant =~ s/^.*\s*Nova\s+([a-z]+)?.*$/$1/i
Because in reality you've only got access to the "match" part of the RegExp, not the full construction.
Code:
^.*\s*Nova\s+([a-z]+)?.*$
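For readers who don't use Perl, here is the same distinction sketched in Python: a yes/no match, which is all the X3 command is thought to give you, versus capture and substitution, which need the full construction. The ship name and the function names are invented for the example.

```python
import re

def is_nova(name: str) -> bool:
    """Match only: answers yes or no, nothing is extracted."""
    return re.search(r"Nova", name) is not None

VARIANT = re.compile(r"Nova\s+([a-z]+)", re.IGNORECASE)

def nova_variant(name: str):
    """Capture: return the word after 'Nova', or None if there is none."""
    found = VARIANT.search(name)
    return found.group(1) if found else None

# Python equivalent of the Perl substitution: replace the whole name
# with the captured variant, ignoring case.
extracted = re.sub(
    r"^.*\s*Nova\s+([a-z]+)?.*$", r"\1", "Argon Nova Raider", flags=re.IGNORECASE
)
```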
-
- Posts: 283
- Joined: Sun, 3. Aug 08, 20:30
Chealec has pointed out a great deal. X3 is more about matching than anything else, as I have found in testing this command.
So far, this is what I've found to be true:
The Regular Expression is case sensitive.
Period (.) acts as the wildcard as in normal RegEx. However, the similarities end at that. In normal RegEx, the period is used for a single character; in X3, the period is used as an unlimited number of characters including whitespace and non-alphanumeric characters.
". Nova" will return true for "Prototype Nova" or "Hey, look over there; it's a Nova" but not "Nova Vanguard".
Caret (^) acts to search for the beginning of a string. This does not search for line-breaks as in normal RegEx.
"^Nova" will return true for "Nova Vanguard" or "Novas are really, really tasty" but not "Prototype Nova".
Dollar Sign ($) acts to search for the end of a string. This does not search for the character or word before a line-break as in normal RegEx.
"Nova$" will return true for "Prototype Nova" or "The rain in Argon Prime falls mainly on the Nova" but not "Nova Vanguard".
Backslash (\) will escape the special character that follows it - caret, dollar sign, period - so that it is matched literally. This is the same as in normal RegEx, where the backslash also escapes the character that follows it.
"\^ Nova" will return true for "^ Nova" or "The rain in Argon Prime...look ^ Nova" but not "Nova".
Brackets ([, ]) create a character class for searching through a series. Such would be [0-9] or [a-d]. These are special characters only when used together.
"Nova 0[1-5]" will return true for "Nova 05" but not "Nova 06".
- Caret (^) when used at the beginning of a character class ( [^a-d] ) will turn the character class into the opposite.
"[^abc] Nova" will return true for "D Nova" or "C Nova" but not "a Nova".
"Humanity has the stars in its future, and that future is too important to be lost under the burden of juvenile folly and ignorant superstition." - Isaac Asimov
- mostlikely
- Posts: 157
- Joined: Sat, 29. Nov 03, 23:20
-
- Posts: 283
- Joined: Sun, 3. Aug 08, 20:30
Ok, finished testing. Here's the rest of it...
Vertical Bar (|) acts as it does in normal RegEx. This is the OR function. When used outside of parentheses, it will OR the entire regular expression. Use the vertical bar inside parentheses to OR a portion of the regular expression.
- ". Nova$|^Nova ." will return true for "Prototype Nova" or "Nova Vanguard". Notice the CARET and DOLLAR SIGN
- "^Nova (Raider|Sentinel)" will return true for "Nova Raider" or "Nova Sentinel" but not "Nova Vanguard".
- "Nova Vanguard( 0[1-9])?" will return true for "Nova Vanguard 01" or "Nova Vanguard".
- "Nova( .)?" will return true for "Nova Vanguard" or "Nova".
"\(.+\) Nova" will return true for "(M3) (HMS Westbrook) Nova" or "(M3) (HMS Westbrook) (Argon Prime) Nova" but not "(M3) Nova".
Asterisk (*) acts to repeat a portion of the regular expression zero or more times.
"\(.*\) Nova" will return true for "(M3) Nova" or "(M3) (HMS Westbrook) Nova".
"Humanity has the stars in its future, and that future is too important to be lost under the burden of juvenile folly and ignorant superstition." - Isaac Asimov
-
- Posts: 283
- Joined: Sun, 3. Aug 08, 20:30
I've updated the <RetVar/IF> = match regular expression: <Var/String> to string <Var/String> to contain the above information in a more organized manner.
"Humanity has the stars in its future, and that future is too important to be lost under the burden of juvenile folly and ignorant superstition." - Isaac Asimov
- mostlikely
- Posts: 157
- Joined: Sat, 29. Nov 03, 23:20
Note there's also
<RetVar/IF><RefObj> get local variables: regular expression=<Var/String>
and
<RetVar/IF> get global variables: regular expression=<Var/String>
Which could be useful in a way I suppose.
Perhaps
<RetVar> = find position of pattern <Var/String> in <Var/String>
and
<RetVar> = substitute in string<Var/String>: pattern <Var/String> with <Var/String>
also use regular expressions (I'm afraid I can't check myself right now)?
The best use for regular expressions is (in my opinion) substitution.
It's a shame there's no
<RetVar> = get substring of <Var/String> pattern=<Var/String> occurrence=<Var/Number>
or
<RetVar> = get occurrences in string<Var/String> pattern=<Var/String>
But I guess such things can be made by custom scripts.
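To show what such helpers would do, here is a minimal Python sketch of the four operations mentioned in this post: position of a pattern, substitution, Nth matching substring and number of occurrences. The function names are hypothetical and only mirror the wording of the commands; this is not X3 script, and whether the existing X3 commands accept regular expressions is still the open question.

```python
import re

def find_position(pattern: str, text: str) -> int:
    """Index of the first match, or -1 when there is none."""
    found = re.search(pattern, text)
    return found.start() if found else -1

def substitute(text: str, pattern: str, replacement: str) -> str:
    """Replace every match of the pattern."""
    return re.sub(pattern, replacement, text)

def get_substring(text: str, pattern: str, occurrence: int):
    """Nth (1-based) non-overlapping match, or None if out of range."""
    found = [m.group(0) for m in re.finditer(pattern, text)]
    return found[occurrence - 1] if 1 <= occurrence <= len(found) else None

def count_occurrences(text: str, pattern: str) -> int:
    """Number of non-overlapping matches."""
    return sum(1 for _ in re.finditer(pattern, text))

serials = "Nova 01, Nova 02, Nova 03"
second_serial = get_substring(serials, r"0[1-9]", 2)
serial_count = count_occurrences(serials, r"0[1-9]")
```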
-
- Posts: 269
- Joined: Sat, 20. May 06, 14:55
WindsOfBoreas wrote: So "^(Falcon|Nova) .* 0[1-9]" would return
Falcon Vanguard 02
Please allow me to note: a pattern does not return or generate a word. It matches words. Only a grammar can derive and thus return words. More formally spoken: grammar defines words, word defines regular expressions, regular expression defines grammar.
All packed into your favorite volume on the theory of computer science.