Tag Archives: Optimizations

String comparison in OpenInsight – Part 4 – String-Only operators

Basic+ is a dynamically-typed language, and sometimes the system will attempt to change the type of a variable before using it in an operation. For example, if we have variable containing a numeric value as a string, and we attempt to compare it to a variable containing an actual numeric value, then the former will be coerced to a numeric format before the comparison, e.g.:

   StrVar = "3"  ; // String representing an integer
   NumVar = 7    ; // Integer

   Ans = ( StrVar = NumVar ) ; // Ans is FALSE$ and StrVar 
                             ; // is converted to an Integer

This works well in the vast majority of cases, but causes issues when attempting to compare strings that that could be numbers but really aren’t for the purposes of the comparison. For example, consider the following:

   Var1 = "12"    ; // String
   Var2 = "012"   ; // String

   Ans = ( VarA = VarB ) ; // Ans = TRUE$ - should be FALSE$!!

The system first attempts to coerce the operands to a numeric format, and of course it can because they both evaluate to the number 12, and are therefore treated as equal. If you wish to compare them both as strings, the usual solution has been to change them so that they cannot be converted to a number, usually by prefixing them with a letter or symbol like so:

   Var1 = "12"    ; // String
   Var2 = "012"   ; // String

   Ans = ( ( "X" : VarA ) = ( "X" : VarB ) ) ; // Ans = FALSE$

Although this works it hurts performance and can be a cumbersome solution.

OpenInsight 10.2 introduces a new set of extended comparison operators to Basic+ that ensure both operands are coerced to a string type (if needed) before being evaluated, and provides the following benefits:

  • Removes the need for any concatenation thereby avoiding unnecessary copying
  • Removes the need to check if a variable can be a number before the comparison
  • Makes it obvious that you are performing a “string-only” comparison so your code looks cleaner.

The new (case-sensitive) operators are:

OperatorDescription
_eqsEquals operator
_nesNot Equals operator
_ltsLess Than operator
_lesLess Than Or Equals operator
_gtsGreater Than operator
_gesGreater Than Or Equals operator

There is also matching set of case-insensitive operators too:

OperatorDescription
_eqscCase-Insensitive Equals operator
_nescCase-Insensitive Not Equals operator
_ltscCase-Insensitive Less Than operator
_lescCase-Insensitive Less Than Or Equals operator
_gtscCase-Insensitive Greater Than operator
_gescCase-Insensitive Greater Than Or Equals operator

So we could write the previous example like so:

   Var1 = "12"    ; // String
   Var2 = "012"   ; // String

   Ans = ( VarA _eqs VarB ) ; // Ans = FALSE$

Hopefully you’ll find this a useful addition when performing string-only comparisons in your own applications.

BLen is the new GetByteSize

As one observant commenter noticed in our last post there’s a new Basic+ function called “BLen” in the version 10 compiler.  This is simply a synonym for the standard GetByteSize function, and was added to:

  1. Save me some typing effort (very important)
  2. Fit in with some of the other binary functions like BRemove and BCol2.

Of course, you may be wondering why GetByteSize/BLen is being used so much that I got tired of typing it?  It’s simply that as we progress through the v10 codebase we’re updating the code to be  “UTF8-safe” – i.e. we’re aiming to ensure that we don’t lose any performance when running in UTF8 mode, and a common Basic+ programming pattern for detecting a non-null variable is this:

   If Len( someVar ) Then
      // Variable is not null
   End

Variables in Basic+ are length-encoded, i.e. they cache the number of bytes that they occupy in memory.  When running in ANSI mode the Len statement simply returns this number (because 1-byte always equals 1 character) so if it’s zero you know you don’t have any data. However, because UTF8 is a multi-byte character-encoding format, the Len statement in UTF8-mode has to scan the contents of the entire variable to count the number of characters – it can’t use the cached byte-count.  This means that a simple check with Len could trigger this counting process when all you really want to know is if the variable contains data, and this could impact performance when dealing with large strings or arrays.

So, the best option is to use GetByteSize rather than Len, which always returns the cached byte-count regardless of ANSI or UTF8-mode, but as I don’t like typing very much you can now use BLen instead.

If you’re interested in writing UTF8-safe code and you’re not familiar with the Basic+ binary functions, you can find more details on them in a series of posts I wrote a few years ago on the Sprezzatura blog.  You may also want to check out the Internationalization section in the OI Coding Standards document too for some more UTF8-mode hints and tips.

(Disclaimer: This article is based on preliminary information and may be subject to change in the final release version of OpenInsight 10).

 

 

 

 

 

Size and Position

In previous versions of OpenInsight the position of a Window or control has always been determined by its SIZE property – an @fm-delimited array comprised of the Left, Top, Width and Height values.  This means, of course, that all of these values have to be processed together at the same time.

For example, if you only want to update a control’s Top attribute then you first have to get the current SIZE property, update the second field, and then set the entire array again like so:

ctrlSize    = get_Property( ctrlID, "SIZE" )
ctrlSize<2> = 100
call set_Property_Only( ctrlID, "SIZE", ctrlSize )

This can become tedious to write and it is also inefficient.

For version 10 we’ve exposed each of the SIZE fields as separate properties so you can now access them directly.  The new properties are:

  • LEFT
  • TOP
  • WIDTH
  • HEIGHT

Here’s the previous example updated to use the TOP property:

call set_Property_Only( ctrlID, "TOP", 100 )

Positioning using rectangle (RECT) coordinates

Those of you used to working with the Windows API will know that many API functions don’t use Width and Height values when dealing with positioning: they work with a “RECT” structure that uses Left, Top, Right and Bottom values to define an object’s position instead (i.e. the coordinates of the top-left corner and the bottom-right corner).

In some cases being able to update a position using Right and Bottom instead of Width and Height can actually be more efficient because it can mean less calculations needed in your own code, and so a new RECT property has been added to enable this functionality.

The new RECT property works in exactly the same way as the current SIZE property except that the Width and Height fields have been replaced by the Right and Bottom fields like so:

* // RECT property structure
* // 
* //   <1> Left
* //   <2> Top
* //   <3> Right
* //   <4> Bottom

As with the SIZE property we’ve also exposed the  individual fields as separate properties so there are two new properties to complement RECT which are:

  • RIGHT
  • BOTTOM

 

 SIZE, RECT and “stealth mode”

As you may know, when using the SIZE property with the Set_Property function you can set a “visible” attribute in the 5th field that can control the visibility of the object when it is moved. For example, setting the SIZE of an invisible WINDOW makes it visible by default unless you set this visible flag to “-1”.  This is still the case in version 10 and it also applies to the new RECT property as well.

However, we have also added a new 6th field that can contain a “Suppress Change Notification” flag.  When this flag is set to TRUE$ the object that has been moved receives no internal notification from Windows that it has been updated, so this will stop any MOVE and SIZE events from begin raised as well as preventing any autosize processing.  This is sometimes necessary when you have complex positioning requirements.

* // Full SIZE property structure when used with Set_Property
* // 
* //   <1> Left
* //   <2> Top
* //   <3> Width
* //   <4> Height
* //   <5> Visibility
* //   <6> Suppress Change Notification
* // Full RECT property structure when used with Set_Property
* // 
* //   <1> Left
* //   <2> Top
* //   <3> Right
* //   <4> Bottom
* //   <5> Visibility
* //   <6> Suppress Change Notification

(Disclaimer: This article is based on preliminary information and may be subject to change in the final release version of OpenInsight 10).

The new STYLE_N property

If you’ve ever had to work with the STYLE or STYLE_EX properties in OpenInsight you’ve probably had to write some gnarly code like this to get around the strange way it returns a string containing a C-style hex representation such as “0x8234ABDB”:

style = iconv( get_Property( object, "STYLE" )[3,8], "MX" )

We’ll probably never know the reason the original authors of OpenInsight decided to implement the property like this, but with version 10 we’ve provided a pair of new properties called “STYLE_N” and “STYLE_EX_N” that return the numeric (decimal) representation of a control’s style directly thereby saving the need for the extra parsing.

The Set_Property_Only subroutine

One of the problems with the existing Set_Property function is that it also performs an implicit Get_Property operation to return the previous contents of the property to the caller.  In many cases this information is simply discarded after the “Set” making the “Get” call actually unnecessary, and this inefficiency goes unnoticed as the quantity of information retrieved during the “Get” is small.  However, in some cases the effect is noticeable: For example, setting the ARRAY property of an EditTable that contains several thousand rows to null (i.e. clearing it), will produce a noticeable delay while the ARRAY contents are accessed and returned.

A new subroutine, called Set_Property_Only, has been added to the OpenInsight 10.  This operates in the same way as the existing Set_Property function except for the fact that it doesn’t bother performing a Get_Property operation to return the previous property contents to the caller.  This helps you to optimize your code for better performance.