'match regular expression' or 'if' statement?

(----____JEFF____----)
Posts: 269
Joined: Sat, 20. May 06, 14:55
x3tc

Post by (----____JEFF____----) » Mon, 12. Jan 09, 13:05

Hi all, I've got a question about the 'match regular expression' code (General Commands -> Strings -> <RetVar/IF> match regular expression: <Var/String> to string <Var/String>).

It seems to do the same as an (==) if statement, so why did Egosoft implement it? I think I'm missing something here. The only obvious difference I see is that one is designed to compare strings and the other anything that can be compared. The only other thing I can think of is that it performs better than a normal if statement (faster/more efficient). If you know the answer I'd like to hear it, and thanks in advance ;)

Edit: I think I figured out what it does. You can compare a word to other words/phrases, so it returns true when you compare "car" to "cartoon" but false the other way around. The only thing to watch out for is that it is case sensitive, so "Car" and "car" return false.
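For readers without the game at hand, the difference described in the edit can be sketched in Python. This is only an analogy: Python's `re.search` stands in for the X3 command, and the helper name `matches` is made up for the illustration.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if `pattern` is found anywhere in `text` (case-sensitive)."""
    return re.search(pattern, text) is not None

print(matches("car", "cartoon"))   # True: "car" occurs inside "cartoon"
print(matches("cartoon", "car"))   # False: "cartoon" does not occur inside "car"
print(matches("Car", "car"))       # False: matching is case-sensitive
print("car" == "cartoon")          # False: == only succeeds on identical strings
```

So the first argument behaves like something to look for, and the second like the text to look in, which is why swapping them changes the result.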

s9ilent
Posts: 2033
Joined: Wed, 29. Jun 05, 01:45
x4

Post by s9ilent » Mon, 12. Jan 09, 13:56

Hrm... interesting.. it could just be one of those "shortcuts" that they put in. Like Skip if/Skipifnot

Or readtext to array from min to max (which could've easily been done via a loop)

Have you tried it with upper/lower case and special characters?



As to whether it's more efficient: you would hope so, but there is probably no way we'll find out, and such a basic command wouldn't use much CPU time anyway.

mostlikely
Posts: 157
Joined: Sat, 29. Nov 03, 23:20
x4

Post by mostlikely » Mon, 12. Jan 09, 13:56

Regular expression wiki

The short story is that regular expressions allow you to do more than a basic character-by-character comparison of strings.

Regular expressions are basically 'patterns' in text you want to search for instead of the exact text.

For example, say you want to find out if a ship's name contains 'Nova'. You can do this by cutting the ship's name into pieces, but you could also use a regular expression like this: '<any text> Nova <any text>'.
Which would find 'Enhanced Nova' but also 'Nova prototype'.
The actual regular expression would look something like '.* Nova .*' (I don't know the exact syntax for x3 mind)

It's hard to come up with actual concrete examples where you would use it. But when you do find a use for regular expressions it usually saves lots of code and time because you can compare/manipulate strings without having to dissect them character by character.
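A quick Python sketch of the "contains Nova" idea, as an analogy only (the X3 syntax may differ, and the ship names are invented). One thing it shows is that in a standard engine a pattern written with literal spaces, such as '.* Nova .*', needs a space on both sides of the word, while a bare 'Nova' searched anywhere in the string finds every name that contains it.

```python
import re

# Invented ship names for the illustration.
names = ["Enhanced Nova", "Nova prototype", "Argon Nova Raider", "Falcon Hauler"]

spaced = re.compile(r".* Nova .*")  # requires " Nova " with a space on each side
bare = re.compile(r"Nova")          # finds "Nova" anywhere in the name

spaced_hits = [n for n in names if spaced.search(n)]
bare_hits = [n for n in names if bare.search(n)]

print(spaced_hits)  # ['Argon Nova Raider']
print(bare_hits)    # ['Enhanced Nova', 'Nova prototype', 'Argon Nova Raider']
```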

(----____JEFF____----)
Posts: 269
Joined: Sat, 20. May 06, 14:55
x3tc

Post by (----____JEFF____----) » Mon, 12. Jan 09, 14:18

Yeah, I tried special characters and capitals. Special characters work fine, but capitals and non-capitals return false ("Nova" and "nova" is false; "Nova Raider" and "Nova Vanguard" is true, like mostlikely said ;)). So for now I don't see many advantages, as upper case letters are considered different from their lower case variants, but I can see it being useful for comparisons that don't involve user input.

Anyway thanks guys ;)

mostlikely
Posts: 157
Joined: Sat, 29. Nov 03, 23:20
x4

Post by mostlikely » Mon, 12. Jan 09, 16:28

(----____JEFF____----) wrote: but capitals and non capitals return false.
Keep in mind that regular expressions are case sensitive unless you tell them otherwise. Again, I don't know the exact syntax, but it's also possible to use 'or' operations in regular expressions:
(n|N)ova = (n or N)ova
would match Nova and nova.
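In Python the same trick looks like this. It is a sketch of the general regex idea, not of the X3 command; the `re.IGNORECASE` flag in particular is a Python feature, and nothing here says X3 has an equivalent.

```python
import re

names = ["Nova", "nova", "NOVA"]

# Alternation: accept either a lower-case or an upper-case first letter.
alternation = [n for n in names if re.search(r"(n|N)ova", n)]

# A character class does the same job for a single character.
char_class = [n for n in names if re.search(r"[nN]ova", n)]

# Python-only shortcut: ignore case for the whole pattern.
ignore_case = [n for n in names if re.search(r"nova", n, re.IGNORECASE)]

print(alternation)  # ['Nova', 'nova']
print(char_class)   # ['Nova', 'nova']
print(ignore_case)  # ['Nova', 'nova', 'NOVA']
```

Note that the alternation only covers the first letter, so "NOVA" is still rejected unless every letter gets the same treatment.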

I can't check atm but are there perhaps any examples in the standard X3 scripts of this function? Perhaps that would help to figure it out as I'm pretty sure it 'should' be able to do what you want.

(----____JEFF____----)
Posts: 269
Joined: Sat, 20. May 06, 14:55
x3tc

Post by (----____JEFF____----) » Mon, 12. Jan 09, 16:46

mostlikely, this is the syntax:

<RetVar/IF> match regular expression: <Var/String> to string <Var/String>
Where <RetVar/IF> can be a if/skip if statement or a variable to put the output in.
The <Var/String>'s can be a string or a variable containing a string. I've no idea if TC has some form of wild card or OR for use in this statement. It does have OR for regular if statements, but that's only for regular if statements afaik (so only like this: if $x == $y OR $z == $y). I also don't know if there are any scripts atm that use this code.

How I tested it so far is by just entering 2 strings and writing the return value to a log.

The thing I want to use this for also works fine with an if statement. I was just curious what it did, how it worked, and whether I could use it.

WindsOfBoreas
Posts: 283
Joined: Sun, 3. Aug 08, 20:30

Post by WindsOfBoreas » Mon, 12. Jan 09, 21:14

mostlikely meant the syntax of the regular expression which would be the first string variable.

In normal usage, a regular expression would look like "^(Falcon|Nova) .* 0[1-9]". This would (say you're searching a database and not within X3) return all the variants of the Falcon and the Nova (Prototype, Raider, Sentinel, Vanguard) that had been serialized from 01 to 09. The period (.) is a wildcard for a single character and the asterisk (*) repeats it, so ".*" stands for whatever text sits between the "Falcon" or "Nova" string and the serial. The caret (^) anchors the match to the start of the string, so an entry with any text before "Falcon" or "Nova" is ignored.

So "^(Falcon|Nova) .* 0[1-9]" would return

Falcon Vanguard 02

or

Nova Nebula 04

but not

Falcon 03

or

(Blue) Nova Raider 07
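These four cases can be checked in Python. The sketch writes the wildcard as `.*`, the form quoted later in the thread, because in Python a bare period matches exactly one character; it says nothing about what X3 accepts.

```python
import re

# Standard-regex form of the pattern; X3's own syntax may differ.
pattern = re.compile(r"^(Falcon|Nova) .* 0[1-9]")

names = [
    "Falcon Vanguard 02",
    "Nova Nebula 04",
    "Falcon 03",               # nothing between the class name and the serial
    "(Blue) Nova Raider 07",   # text before "Nova", so ^ rejects it
]

matched = [n for n in names if pattern.search(n)]
print(matched)  # ['Falcon Vanguard 02', 'Nova Nebula 04']
```

"Falcon 03" fails because the pattern contains two literal spaces, one before the wildcard and one before the serial, and that name has only one.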

_________________________

You would use something like this within a loop. Say you got an array of player owned ships within your current sector, but you only wanted to control the Novas and any variants that had been assigned a designation by you at the start of the name.

You would use "^\(.\) (.)? Nova (.)?" to narrow the array down with that regular expression.

Now, this would be rather esoteric, as you would undoubtedly have put something useful between the leading parentheses, like a homebase or wing, and for those you would simply use the respective script commands instead of this.

However, if you were making something like a renaming script, this command would be perfect for mass renames, as a way to exclude certain objects.
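The "narrow the array down inside a loop" idea can be sketched in Python as a filter over a list of names. The ship names and the exact pattern are assumptions made for the illustration: the pattern is a standard-regex adaptation of the one above, not the X3 form.

```python
import re

# Hypothetical player ship names; a "(...)" prefix is a designation set by the player.
ships = [
    "(Wing A) Nova Raider",
    "Nova Vanguard",
    "(Wing B) Falcon Hauler",
    "(Wing A) Prototype Nova",
]

# Designation in parentheses, optional words before and after "Nova".
designated_nova = re.compile(r"^\(.*\) (.* )?Nova( .*)?$")

selected = []
for name in ships:
    if designated_nova.search(name):  # keep only designated Novas
        selected.append(name)

print(selected)  # ['(Wing A) Nova Raider', '(Wing A) Prototype Nova']
```

A renaming script would then work on `selected` and leave every other ship untouched.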
Last edited by WindsOfBoreas on Tue, 13. Jan 09, 20:32, edited 2 times in total.
"Humanity has the stars in its future, and that future is too important to be lost under the burden of juvenile folly and ignorant superstition." - Isaac Asimov

(----____JEFF____----)
Posts: 269
Joined: Sat, 20. May 06, 14:55
x3tc

Post by (----____JEFF____----) » Mon, 12. Jan 09, 21:45

Aahh.. ok, I thought he just meant the syntax of the code :roll::shock: Anyway does that work in X3TC WindsOfBoreas?

Gazz
Posts: 13244
Joined: Fri, 13. Jan 06, 16:39
x4

Post by Gazz » Mon, 12. Jan 09, 21:58

I'm mostly curious about how much of the theoretical syntax actually works with the script command. =)

WindsOfBoreas
Posts: 283
Joined: Sun, 3. Aug 08, 20:30

Post by WindsOfBoreas » Mon, 12. Jan 09, 22:10

Most likely none of the syntax I laid out works in X3. That is all from the HTML regular expression code (though most of it does work in the rest of the RegEx engines).

I'll do some testing of it right after dinner here and get back to you.

Chealec
Posts: 1916
Joined: Sun, 20. Aug 06, 10:54
x3tc

Post by Chealec » Mon, 12. Jan 09, 23:56

WindsOfBoreas wrote: However, if you were making something like a renaming script
It's unlikely to work because I _think_ and I'm by no means certain (I'm better with Perl than "X-Scripting"), that the RegExp matching in X3 is just that, matching.

This means you can't assign flags (such as "i" for case-insensitive matching) or use parentheses to store matches for extraction. So you can't extract the variant type using something like:

$sVariant =~ s/^.*\s*Nova\s+([a-z]+)?.*$/$1/i
(as a Perl example)

Because in reality you've only got access to the "match" part of the RegExp, not the full construction.

^.*\s*Nova\s+([a-z]+)?.*$
There is also likely to be a slight overhead with using RegExps as opposed to just plain string matching - it won't be great, but it's probably best to use RegExps only where you'd otherwise end up doing masses of nested conditionals.
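For comparison, this is roughly what a full regex engine offers beyond a yes/no match, sketched in Python with the same pattern as the Perl line. The helper name `variant_of` and the ship names are made up, and none of this is claimed to be available in X3.

```python
import re
from typing import Optional

# Same pattern as the Perl example; re.IGNORECASE plays the role of the "i" flag.
VARIANT = re.compile(r"^.*\s*Nova\s+([a-z]+)?.*$", re.IGNORECASE)

def variant_of(name: str) -> Optional[str]:
    """Return the word captured after 'Nova', or None if there is no match."""
    m = VARIANT.search(name)
    return m.group(1) if m else None

print(variant_of("Argon Nova Raider"))  # Raider
print(variant_of("Nova Vanguard 01"))   # Vanguard
print(variant_of("Falcon Hauler"))      # None
```

The captured group is exactly the part a match-only command cannot hand back, which is why extraction would still need manual string handling.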

WindsOfBoreas
Posts: 283
Joined: Sun, 3. Aug 08, 20:30

Post by WindsOfBoreas » Tue, 13. Jan 09, 00:14

Chealec has pointed out a great deal. X3 is more about matching than anything else, as I have found in testing this command.

So far, this is what I've found to be true:

The Regular Expression is case sensitive.

Period (.) acts as the wildcard as in normal RegEx. However, the similarities end at that. In normal RegEx, the period is used for a single character; in X3, the period is used as an unlimited number of characters including whitespace and non-alphanumeric characters.

". Nova" will return true for "Prototype Nova" or "Hey, look over there; it's a Nova" but not "Nova Vanguard".


Caret (^) acts to search for the beginning of a string. This does not search for line-breaks as in normal RegEx.

"^Nova" will return true for "Nova Vanguard" or "Novas are really, really tasty" but not "Prototype Nova".


Dollar Sign ($) acts to search for the end of a string. This does not search for the character or word before a line-break as in normal RegEx.

"Nova$" will return true for "Prototype Nova" or "The rain in Argon Prime falls mainly on the Nova" but not "Nova Vanguard".


Backslash (\) will escape the special character that follows it (caret, dollar sign, period), so that character is matched literally. This is the same as in normal RegEx, where the backslash also applies to the character that follows it.

"\^ Nova" will return true for "^ Nova" or "The rain in Argon Prime...look ^ Nova" but not "Nova".


Brackets ([, ]) create a character class for searching through a series. Such would be [0-9] or [a-d]. These are special characters only when used together.

"Nova 0[1-5]" will return true for "Nova 05" but not "Nova 06".
  • Caret (^) when used at the beginning of a character class ( [^a-d] ) will turn the character class into the opposite.

    "[^abc] Nova" will return true for "D Nova" or "C Nova" but not "a Nova".

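The results listed for period, caret, dollar sign, backslash and brackets can be compared against a standard engine. The Python sketch below runs each reported case through `re.search`, which looks for the pattern anywhere in the string. It does not show how X3 implements the command; it only shows that these particular results are also what a standard engine gives when the period stands for a single character and the search is not tied to the start of the string.

```python
import re

# (pattern, string, result reported in the thread)
cases = [
    (r". Nova", "Prototype Nova", True),
    (r". Nova", "Hey, look over there; it's a Nova", True),
    (r". Nova", "Nova Vanguard", False),
    (r"^Nova", "Nova Vanguard", True),
    (r"^Nova", "Novas are really, really tasty", True),
    (r"^Nova", "Prototype Nova", False),
    (r"Nova$", "Prototype Nova", True),
    (r"Nova$", "The rain in Argon Prime falls mainly on the Nova", True),
    (r"Nova$", "Nova Vanguard", False),
    (r"\^ Nova", "^ Nova", True),
    (r"\^ Nova", "The rain in Argon Prime...look ^ Nova", True),
    (r"\^ Nova", "Nova", False),
    (r"Nova 0[1-5]", "Nova 05", True),
    (r"Nova 0[1-5]", "Nova 06", False),
    (r"[^abc] Nova", "D Nova", True),
    (r"[^abc] Nova", "C Nova", True),
    (r"[^abc] Nova", "a Nova", False),
]

# Collect every case where Python disagrees with the reported result.
mismatches = [
    (pattern, text)
    for pattern, text, reported in cases
    if (re.search(pattern, text) is not None) != reported
]

print(mismatches)  # []
```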
mostlikely
Posts: 157
Joined: Sat, 29. Nov 03, 23:20
x4

Post by mostlikely » Tue, 13. Jan 09, 00:32

Great work.

I'm sure this will be of good use for someone.

WindsOfBoreas
Posts: 283
Joined: Sun, 3. Aug 08, 20:30

Post by WindsOfBoreas » Tue, 13. Jan 09, 02:25

Ok, finished testing. Here's the rest of it...

Vertical Bar (|) acts as it does in normal RegEx. This is the OR function. When used outside of parentheses, it will OR the entire regular expression. Use the vertical bar inside parentheses to OR only a portion of the regular expression.
  • ". Nova$|^Nova ." will return true for "Prototype Nova" or "Nova Vanguard". Notice the CARET and DOLLAR SIGN
  • "^Nova (Raider|Sentinel)" will return true for "Nova Raider" or "Nova Sentinel" but not "Nova Vanguard".
Question Mark (?) acts to make a portion of the regular expression preceding the question mark optional. Can be used on a letter, a digit, a word, a character class, or a special character.
  • "Nova Vanguard( 0[1-9])?" will return true for "Nova Vanguard 01" or "Nova Vanguard".
  • "Nova( .)?" will return true for "Nova Vanguard" or "Nova".
Plus Sign (+) acts to repeat a portion of the regular expression one or more times.

"\(.+\) Nova" will return true for "(M3) (HMS Westbrook) Nova" or "(M3) (HMS Westbrook) (Argon Prime) Nova" but not "(M3) Nova".


Asterisk (*) acts to repeat a portion of the regular expression zero or more times.

"\(.*\) Nova" will return true for "(M3) Nova" or "(M3) (HMS Westbrook) Nova".

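The second batch of results (vertical bar, question mark, plus sign, asterisk) can be run through Python's `re.search` in the same way, again purely as a comparison with a standard engine. One case comes out differently: in Python, "\(.+\) Nova" also matches "(M3) Nova", because "M3" satisfies the `.+`. Whether that reflects a real difference in X3 or something specific to the original test is not something this sketch can settle.

```python
import re

# (pattern, string, result reported in the thread)
cases = [
    (r". Nova$|^Nova .", "Prototype Nova", True),
    (r". Nova$|^Nova .", "Nova Vanguard", True),
    (r"^Nova (Raider|Sentinel)", "Nova Raider", True),
    (r"^Nova (Raider|Sentinel)", "Nova Sentinel", True),
    (r"^Nova (Raider|Sentinel)", "Nova Vanguard", False),
    (r"Nova Vanguard( 0[1-9])?", "Nova Vanguard 01", True),
    (r"Nova Vanguard( 0[1-9])?", "Nova Vanguard", True),
    (r"Nova( .)?", "Nova Vanguard", True),
    (r"Nova( .)?", "Nova", True),
    (r"\(.+\) Nova", "(M3) (HMS Westbrook) Nova", True),
    (r"\(.+\) Nova", "(M3) (HMS Westbrook) (Argon Prime) Nova", True),
    (r"\(.+\) Nova", "(M3) Nova", False),
    (r"\(.*\) Nova", "(M3) Nova", True),
    (r"\(.*\) Nova", "(M3) (HMS Westbrook) Nova", True),
]

# Cases where Python's answer differs from the one reported for X3.
differs = [
    (pattern, text)
    for pattern, text, reported in cases
    if (re.search(pattern, text) is not None) != reported
]

print(differs)  # [('\\(.+\\) Nova', '(M3) Nova')]
```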
WindsOfBoreas
Posts: 283
Joined: Sun, 3. Aug 08, 20:30

Post by WindsOfBoreas » Tue, 13. Jan 09, 03:37

I've updated the <RetVar/IF> = match regular expression: <Var/String> to string <Var/String> to contain the above information in a more organized manner.

mostlikely
Posts: 157
Joined: Sat, 29. Nov 03, 23:20
x4

Post by mostlikely » Tue, 13. Jan 09, 09:07

Note there's also
<RetVar/IF><RefObj> get local variables: regular expression=<Var/String>
and
<RetVar/IF> get global variables: regular expression=<Var/String>
Which could be useful in a way, I suppose.

Perhaps
<RetVar> = find position of pattern <Var/String> in <Var/String>
and
<RetVar> = substitute in string<Var/String>: pattern <Var/String> with <Var/String>
also use regular expressions (I'm afraid I can't check myself right now)?
The best use for regular expressions is (in my opinion) substitution.

It's a shame there's no
<RetVar> = get substring of <Var/String> pattern=<Var/String> occurrence=<Var/Number>
or
<RetVar> = get occurrences in string<Var/String> pattern=<Var/String>
But I guess such things can be made by custom scripts.
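As a rough idea of what those two missing commands would do, here is a Python sketch. The function names `nth_match` and `count_matches` are invented for the illustration and mirror the wished-for commands only loosely; a custom X3 script would have to build the same thing from the string commands the game actually has.

```python
import re
from typing import Optional

def nth_match(text: str, pattern: str, occurrence: int) -> Optional[str]:
    """Return the text of the given occurrence (1-based) of pattern, or None."""
    for index, match in enumerate(re.finditer(pattern, text), start=1):
        if index == occurrence:
            return match.group(0)
    return None

def count_matches(text: str, pattern: str) -> int:
    """Return how many non-overlapping matches of pattern occur in text."""
    return sum(1 for _ in re.finditer(pattern, text))

fleet = "Nova Raider, Nova Vanguard, Falcon Hauler"
print(nth_match(fleet, r"Nova \w+", 2))   # Nova Vanguard
print(nth_match(fleet, r"Nova \w+", 3))   # None
print(count_matches(fleet, r"Nova \w+"))  # 2
```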

(----____JEFF____----)
Posts: 269
Joined: Sat, 20. May 06, 14:55
x3tc

Post by (----____JEFF____----) » Tue, 13. Jan 09, 11:01

Thanks a lot WindsOfBoreas, that's some very interesting and useful information :D:wink:

Jar B
Posts: 292
Joined: Tue, 18. Jan 05, 14:07
x4

Post by Jar B » Tue, 13. Jan 09, 20:16

WindsOfBoreas wrote:So "^(Falcon|Nova) .* 0[1-9]" would return

Falcon Vanguard 02
Please allow me to note: a pattern does not return or generate a word. It matches words. Only a grammar can derive and thus return words. More formally: a grammar defines words, words define regular expressions, and regular expressions define a grammar.

All packed into your favorite volume on the theory of computer science ;)
