Author Archives: Captain C

The Case of the Jumping Dialog

We recently noticed a new bug in the IDE where a dialog that hadn’t been updated for quite some time suddenly started misbehaving: it would appear at the wrong coordinates (0,0), and then jump to the proper ones.

At first glance this this looked like a classic case of creating the dialog in a visible state, and then moving it during the CREATE event, but checking the dialog properties in the Form Designer showed that the dialog’s Visible property was “Hidden”, so this wasn’t the source of the problem.

Stepping through the CREATE event in the debugger showed that the dialog was indeed created hidden, but then became visible during an RList() call that performed a SELECT statement, ergo RList() was allowing some queued event code to execute (probably from an internal call to the Yield() procedure) and that was changing the dialog’s VISIBLE property.

Checking the other events revealed that the SIZE event contained the following code:

Call Set_Property( @Window, "REDRAW", FALSE$ )

// Move some controls around

Call Set_Property( @Window, "REDRAW", TRUE$ )

The REDRAW property works by toggling an object’s WS_VISIBLE style bit – when set to FALSE$ the style bit is removed but the object is not redrawn. When set to TRUE$ the object is marked as visible and redrawn.

So, putting all this together: creating the dialog generated a queued SIZE event, which under normal circumstances would run after the CREATE event. However, the Yield() call in RList() executed the SIZE event before the dialog was ready to be shown, and the REDRAW operation made the dialog visible before the CREATE event had moved it to the correct position.

The fix for this was to ensure that the REDRAW property wasn’t set to TRUE$ if it’s original value wasn’t TRUE$, like so:

bRedraw = Set_Property( @Window, "REDRAW", FALSE$ )

// Move some controls around

If bRedraw Then
   Call Set_Property( @Window, "REDRAW", TRUE$ )
End

Conclusion

  • Always protect calls to the REDRAW property by checking its state before you set it, and then only turn it back on again if it was set to TRUE$ originally.
  • Calling Yield() can cause events to run out of the normal sequence, and you should be aware of this when writing your application.

PDF files and the FILEPREVIEW control

The next release of OpenInsight (version 10.1) includes a couple of updates to the FILEPREVIEW control as a result of using it extensively “out in the field”, and in this post we thought we’d look at these changes and why we made them in case you encounter the same issues yourself.

The Adobe problem

As mentioned in this previous post, the FILEPREVIEW control relies on third-party DLLs to provide “preview handlers” that OpenInsight uses to display the contents of files such as Word or PDF documents. However, what we found is that not all of these handlers are created equal and some can be quite problematic – in our case the Adobe PDF preview handler (supplied with the Adobe PDF Reader) proved to be one of these.

When the handler is loaded by OpenInsight one of the things that must be specified is the context in which it is created – this can be “in-process” (which means it runs in the same address space as OpenInsight) or “out-of-process” (which runs as a separate executable). This is done internally by a set of flags, and when you use the FILENAME property these flags are set to their default values which, until recently, had proved sufficient. However, extensive testing (by Martyn at RevSoft) found that the Adobe PDF preview handler had stopped working, and further investigation revealed that at some point recent versions of this had become sensitive to these context flags, so the first change we made was to provide a new SETFILENAME method, which allows you to set the flags yourself if need be:

The SETFILENAME method

RetVal = Exec_Method( CtrlEntID, "SETFILENAME", FileName, FileExtn, |
                      ContextFlags )
ParameterRequiredDescription
FileNameNoContains the name and path of the file to preview (can be null to remove the preview).
FileExtnNoSpecifies an explicit extension to use, overriding the extension passed in the FileName parameter.
ContextFlagsNoSpecifies a bit-mask of “CLSCTX_” flags used to create the preview handler. Defaults to:

  BitOr( CLSCTX_INPROC_SERVER$, CLSCTX_LOCAL_SERVER$ )

(Equates for these flags can be found in the MSWIN_CLSCTX_EQUATES insert record)

If the returned value is 0 then the operation was successful, otherwise this is an error code reported from Windows and can be passed to the RTI_ErrorText stored procedure to get the details:

E.g.

// Load the PDF in an out-of-process context
$Insert MSWin_ClsCtx_Equates

RetVal = Exec_Method( CtrlEntID, "SETFILENAME", "C:\Temp\Test.PDF", "",
                      CLSCTX_LOCAL_SERVER$ )
If RetVal Then
   // Problem...
   ErrorText = RTI_ErrorText( "WIN", RetVal )
End

Even with this you may still find problems, as the above code was fine for me, but not for Martyn, even though the PDF preview handler worked fine in Windows Explorer itself for both of us! So, we could only conclude that Adobe made sure that the handler worked with the Windows Explorer, but they were less concerned about third party applications (Per-monitor DPI settings are also not supported by the preview handler which is disappointing as well).

The Foxit solution

After some more testing we decided to switch to the Foxit PDF reader which worked as expected, so we would recommend using this for PDF previewing in future if needed.

Removing the FILENAME property at design-time

One other change we made was to remove the FILENAME property from the Form Designer so that it could not be set at design-time due to the following reasons:

  • We had reports that once it had been set it was very difficult to select the control again in the Form Designer, because it basically takes over mouse handling!
  • Document previewing is deemed to more of a run-time operation than a design-time operation.
  • The FILENAME property is deprecated in favor of the SETFILENAME method because the latter provides a more complete API. The FILENAME property is still supported however, and will be going forwards.

Conclusion

So, for v10.1 we have provided a new SETFILENAME method to provide a better interface for file-previewing which gives more feedback and more control, and you should use this in preference to the FILENAME property.

We have also found the Adobe PDF preview handler to be somewhat temperamental in use so would recommend the Foxit preview handler instead if you have problems with the former (Note however, that other preview handlers we use regularly, such as Word, Excel and PowerPoint have all worked well without any issues so far).

The Saga of ShellExecute

One of the most popular “raw” Windows API functions that OpenInsight developers have used over the years is the ShellExecute function, which allows you to launch an application via its filename association, e.g. you can launch Word by using a document file name, or Excel using a spreadsheet filename and so on.

However, because it was never really made an “official” part of the product (it was normally passed on in forums), developers were left to create their own DLL Prototype definitions in order to use it – this gave rise to many variations over the years, many of which were not compatible with others. For example, some use LPCHAR as an argument type, some use LPSTR or LPASTR, whilst others use LPVOID with GetPointer(); some definitions use the “Wide” version of the function, some the “Ansi” version, and there are many different aliases, with or without the “A/W” suffix too. The list goes on.

For OpenInsight 10 we decided that we couldn’t move forward with this as we would run the risk of conflicting with established applications, so we moved all of the DLL Prototypes we used into a new namespace called “MSWIN_” and claimed it as our own. This left developers to bring forward their own DLL prototypes into version 10 as and when needed, and therefore we didn’t supply a “ShellExecute” function as such, though we did supply “MsWin_ShellExecute” instead (see below).

Another decision we took was to try and move away from the need for developers to use raw Windows API function calls as much as possible, as some of them can be complex and require knowledge of C/C++ programming, which is not necessarily a skill set that everyone has the time or desire to learn. Ergo, we moved a lot of functionality into the Presentation Server (PS) and created some Basic+ wrapper functions around others to shield developers from the sometimes gory internals.

(We also chose to use the “W” versions of functions rather than the “A” versions where possible, because these would translate better when in UTF8 mode and remove the need for an extra “A”->”W” conversion in Windows itself.)

So, coming back to ShellExecute, and in light of the above, we have three “official” and supported ways of calling it in OpenInsight 10 as detailed below:

  • The SYSTEM object SHELLEXEC method
  • The RTI_ShellExecuteEx stored procedure
  • The MSWin_ShellExecute DLL Prototype stored procedure

The SYSTEM object SHELLEXEC method

If your program is running in “Event Context”, (i.e. it is executing in response to an event originating from the PS) then you may use the SYSTEM SHELLEXEC method which invokes ShellExecuteW internally.

RetVal = Exec_Method( "SYSTEM", "SHELLEXEC", OwnerForm, Operation, File, |
                      Parameters, WorkingDir, ShowCmd )
ParameterRequiredDescription
OwnerFormNoName of a form to use as a parent for displaying UI messages.
OperationNoOperation to be performed; “open”, “edit”, “print” etc.
FileYesFile to perform the operation on.
ParametersNoIf File is an executable file this argument should specify the command line parameters to pass to it.
WorkingDirNoThe default working directory for the operation. If null the current working directory is used.
ShowCmdNoDetermines how an application is displayed when it is opened (as per the normal VISIBLE property).

The return value is the value returned by ShellExecuteW.

The RTI_ShellExecuteEx method

This stored procedure is a wrapper around the Windows API ShellExecuteExW function (which is used internally by ShellExecuteW itself), and may be used outside of event context – it can also return the handle to any new process it starts as a result of executing the document. As you can see it’s quite similar to the SHELLEXEC method:

RetVal = RTI_ShellExecuteEx( Hwnd, Verb, File, Parameters, |
                             Directory, nShow, hProcess )
ParameterRequiredDescription
HwndYesHandle of a window to use as a parent for displaying UI messages, or null (0) to use the desktop.
VerbNoOperation to be performed; “open”, “edit”, “print” etc.
FileYesFile to perform the operation on.
ParametersNoIf File is an executable file this argument should specify the command line parameters to pass to it.
DirectoryNoThe default working directory for the operation. If null the current working directory is used.
nShowNoDetermines how an application is displayed when it is opened (as per the normal VISIBLE property).
hProcessNoReturns the handle to the new process.

The return value is the value returned by ShellExecuteExW.

The MSWin_ShellExecute DLL Prototype stored procedure

This is the “raw” DLL function that is included with OI10, and the definition can be found in the MSWIN_SHELL32 DLLPROTOTYPE entity:

Shows the MSWIN_SHELL32 DLLPROTOTYPE entity

Because we’re using LPWSTR data types there is no need to null-terminate any of your variables so using it is quite simple:

RetVal = MsWin_ShellExecute( 0, "open", "stuff.docx", "", "c:\docs", 1 )

Migrating ShellExecute

Whilst you are free to use one of the methods outlined above, this may not be optimal if you are still sharing code between your existing version 9 application and your new version 10 one. In this case there are a couple of options you could use:

  • Define your preferred DLL prototype in v10.
  • Use a wrapper procedure and conditional compilation.

Defining your own prototype

This is probably the easiest option – you simply use the same prototype in v10 that you did in version 9, with the same alias (if any), and this way the code that uses it doesn’t need to be changed. The only downside to this if you’ve used any 32-bit specific data types instead of 32/64-bit safe types like HANDLE (this could happen if you have a really old prototype) – you must ensure that you use types that are 64-bit compliant.

Using conditional compilation

This is a technique we used when writing the initial parts of v10 in a v9 system so our stored procedures would run the correct code depending on the platform they were executing on (it was actually first used to share code between ARev and OI many years ago!).

The v10 Basic+ compiler defines a token called “REVENG64” which is not present in the v9 compiler – this means that you can check for this in your source code with “#ifdef/#ifndef” directives and write code for the different compiler versions.

For example, you could write your own wrapper procedure for ShellExecute that looks something like this:

Compile Function My_ShellExecute( hwnd, verb, file, params, dir, nShow )

#ifdef REVENG64
   // V10 Compiler - use RTI function
   Declare Function RTI_ShellExecuteEx
   RetVal = RTI_ShellExecuteEx( hwnd, verb, file, params, dir, nShow, "" )
#endif

#ifndef REVENG64
   // V9 Compiler - use existing "raw" prototype
   Declare Function ShellExecute
   RetVal = ShellExecute( hwnd, verb, file, params, dir, nShow )
#endif

Return RetVal

And then call My_ShellExecute from your own code.

So, there ends the Saga of ShellExecute … at least for now.

Methods, Events, and Documentation

In a recent post we provided a preview of the OpenInsight IMAGE API documentation for the upcoming release of version 10.1. As that proved quite popular we thought we’d provide some more, this time dealing with the Common GUI API (i.e. the basic interface that virtually every GUI object supports) and the WINDOW object API – two core areas of OI GUI programming.

Methods, not Events

One thing you may notice as you look through these documents is the addition of many new methods, such as SHOWOPTIONS or QBFCLOSESESSION – this is an attempt to tidy up the API into a more logical and coherent format that is a better fit for an object-based interface.

As we went through the product in order to document it, it became very apparent that there were many instances where events were being used to mimic methods, such as sending a WRITE event to save the data in a form, or sending a CLICK event to simulate a button click and so on. In object-based terminology this sort of operation would be performed by a method, which is a directive that performs an action – the event is a notification in response to that action. So, for example, you would call a “write” method to save your data and the system would raise a “write” event so you could deal with it.

Of course, this distinction will probably not bother many developers – just API purists like myself, but this does have another advantage if you like to use Object Notation Syntax (I do) – you can now perform actions such as reading and writing form data by using the”->” notation, whereas before you would have to use the Send_Event stored procedure which essentially breaks the object-based paradigm.

So instead of:

   Call Send_Event( @Window, "WRITE" )

you would use the form’s WRITEROW method instead:

   @@Window->WriteRow( "" )

which is a more natural fit for this style of programming.

(It is also easier to explain to new OI programmers who are used to other object-based languages and environments where everything is properties, methods and events).

Methods, not Stored Procedures

This brings us finally onto the topic of Stored Procedures and the object API, where several of these also fulfill the role of methods. For example, take the venerable Msg stored procedure used to display a message box for a parent form – a different way of treating this would be to have a SHOWMESSAGE method for the parent form rather than using a “raw” Msg call. Likewise for starting a new form: instead of using the raw Start_Window procedure, the SYSTEM and WINDOW objects now support a STARTFORM method instead.

Of course, none of this changes your existing code, nor is it enforced, it’s just something you can use if and when you wish to. However, even if my API pedantry hasn’t persuaded you to change your coding style, some of the new methods are worth investigating as they provide a better opportunity for us to extend the product’s functionality further – take a look at the WINDOW READROW and WRITEROW methods for an example of this – they support new features that we couldn’t do with just sending events.

In any case, here are the links – hopefully some light reading for your weekend!

Reordering tabs with the AllowDragReorder property

The next release of OpenInsight includes a new TABCONTROL property called ALLOWDRAGREORDER, which allows you to drag a tab to a new position within the control. It’s a simple boolean property, and when set to True tabs may be dragged and reordered with the mouse – an image of the tab is moved with the cursor, and a pair of arrows are displayed to mark the potential insertion point.

Here’s an example of a tab being dragged in the IDE:

Shows a image of the IDE with a tab being dragged by a cursor, along with the drag0image and the insertion marker arrows.

Bonus trivia

  • The tabs may be scrolled while dragging by hovering outside either edge of the control.
  • This property is not supported for:
    • MultiLine tab controls
    • Vertically-aligned tab controls
  • The LISTBOX control also supports this property for reordering its items – see the “Order Tabs” dialog in the Form Designer for an example, or the list of types in the IDE’s “Open Entity” dialog.

Screenshots with the CAPTUREIMAGE method

Bitmap controls in OpenInsight 10 have a method called CAPTUREIMAGE, which allows you to “screenshot” the contents of another OI control or form into the Bitmap control’s IMAGE sub-object. As you can see, it has a very simple interface:

SuccessFlag = Exec_Method( BitMapCtrlID, "CAPTUREIMAGE", CaptureID )

Where “CaptureID” is the fully qualified name of the control to screenshot.

E.g.

If we have a form called TEST_CAPTUREIMAGE, with a BITMAP control called BMP_SCREENSHOT, then we can screenshot the contents of the IDE into it like so:

Call Exec_Method( "TEST_CAPTUREIMAGE.BMP_SCREENSHOT", "CAPTUREIMAGE", |
                   "RTI_IDE" )
Shows a captured image of the OpenInsight IDE in a Bitmap control.

(N.B. The captured image you see displayed above is scaled – the screenshot is stored at full resolution in the control itself)

One obvious use for this is for support purposes, e.g:

  • Take a screen-shot with CAPTUREIMAGE.
  • Use The SAVETOFILE method in the IMAGE API to save it to a file.
  • Create an email message with the image file attached or embedded and send it to your support desk.

We’re sure you can think of more.

Bonus trivia:

  • CAPTUREIMAGE works with any object that supports the Windows WM_PRINTCLIENT message.
  • BITMAP controls are basically an alias for STATIC controls, so all STATIC controls support this method.

Context Menu updates

The next release of OpenInsight sees a few updates to context menus and the ContextMenu Designer, so in this post we’ll take a brief look at these upcoming changes.

Moving the focus

One important aspect of standard Windows context menu behavior is that the focus is moved (if possible) to the control that the menu belongs to. Current versions of OpenInsight do not follow this pattern so the next release includes a fix for this, and this is something you should be aware of just in case it impacts your application (though to be honest, we’re not really expecting it to!).

Test-Run support

The Context-Menu Designer now supports the IDE “Test-Run” feature, so that you can see how your context menu will appear when you use it in your application.

When you test-run your context menu you will see a simple dialog box with an edit control (EDL_TEST) and and a static control (TXT_TEST) like so:

Test-run context menu dialog box

Right-clicking either of these controls displays your context menu:

Selecting an item displays it’s fully-qualified name, which has the standard format of:

<windowName> "." <controlName> ".CONTEXTMENU." <itemName>

So, for the test run dialog, it will be one of the following:

"RTI_DSN_CONTEXTMENU_TESTRUN.EDL_TEST.CONTEXTMENU." <itemName>
"RTI_DSN_CONTEXTMENU_TESTRUN.TXT_TEST.CONTEXTMENU." <itemName>

E.g.

Message box showing the name of the menu item that was clicked

Common menu support

The initial release of the ContextMenu Designer in v10.0.8 included check-boxes for two “common menu” options as shown in the screenshot below. Each of these options appends a set of standard menu items to your context menu, and both have been enhanced for the next release and include new artwork as well.

Shows the Content Menu designer with the  "Include OI Menu" and "Include Windows Menu" check-boxes highlighted.

The “OI Menu” appends the following items:

  • Options – Display options for the current control.
  • Help – Display help for the current control.
  • Data Binding – Display data-binding information for the current control.

Whilst the “Windows Menu” appends the following standard “Edit” items instead:

  • Undo
  • Cut
  • Copy
  • Paste
  • Delete
  • Select All

In both cases the default system CONTEXTMENU event (i.e. the event responsible for actually displaying the menu) synchronizes the items to the parent control by using the HELPFLAGS and EDITSTATEFLAGS properties respectively.

(The definition for these items can be found in the SYSPROG “OIMENU_” and “WINMENU_” ContextMenu entities respectively – you may adjust these if you wish, but be aware that they may be overwritten in future OpenInsight updates, so you should make copies in your own application).

The @MENUPARENT pseudo-control name

When using QuickEvents there are several pseudo-control names you can use, such as “@WINDOW”, “@FOCUS” and “@SELF”, that are resolved to a “real” control name at runtime.

However, in order to be able to reference the context menu’s parent control at runtime we’ve introduced a new pseudo-control name called “@MENUPARENT”. This resolves to the name of the control displaying the menu and should be used in place of “@FOCUS” because it is perfectly possible for controls that don’t accept the focus (like Static Text controls) to have a context menu, and @FOCUS would not resolve to the correct value. Note that @MENUPARENT can only be used with MENU QuickEvents for context menu items – it cannot be used with any other type or event.

Shows the @MENUPARENT pseudo-control name being used for a menu QuickEvent

Context menus are an essential part of modern user interface design and we encourage you to use them as much as possible in your own applications – hopefully you’ll find that the tools provided in OpenInsight 10 make this easy to achieve!

The SHOWDATABINDING method

In the next release of OpenInsight we’ve added a new feature that allows you to quickly display runtime databinding information for the controls in your application – the aptly named SHOWDATABINDING method.

It’s a simple method that is supported by all controls, and can be invoked like so:

Call Exec_Method( CtrlEntID, "SHOWDATABINDING" )

If the control is bound to a database table then it displays a view-only dialog of data binding information for that control. The following example shows the information for a bound column in an EditTable control:

Dialog box showing an example of the databinding information displayed via the SHOWDATABINDING method.

The Description, Validation, Heading and Formula attributes all have their own sub-dialog boxes to display their full details.

If the control is not databound a simple message is displayed to inform the user of the fact.

This method can easily be added to menu or contextmenu QuickEvents in your own applications if you wish to expose this information to your users, or just for your own diagnostic purposes.

String comparison in OpenInsight – Part 3 – Linguistic Mode

Welcome to the final part of this mini series on the string comparison mechanics in OpenInsight. In the first two parts we reviewed how this task is currently handled in both ANSI and UTF8 modes, but this time we’ll take a look at a new capability introduced for the next release which is called the “Linguistic String Comparison Mode”.

As we’ve seen previously, there is certainly room for improvement when dealing with string comparisons and sorting in non-English languages, mainly due to the burden placed on the developer to maintain the sorting parameters, especially once the requirements extend beyond the basic ANSI character set. There is also no advantage taken of the capabilities of Windows itself, which provides a comprehensive National Language Support (NLS) API for testing strings for linguistic equality.

What is “linguistic equality”?

If you’re unfamiliar with the term “linguistic equality” it essentially means comparing strings according to the language rules of a specific locale thereby providing appropriate results for a user of that locale. For example, consider the following cases that illustrate how comparisons differ for the same characters in different locales:

  • Many locales equate the ae ligature (æ) with the letters ae. However, Icelandic (Iceland) considers it a separate letter and places it after Z in the sorting sequence.
  • The A Ring (Å) normally sorts with merely a diacritic difference from A. However, Swedish (Sweden) places the A Ring after Z in the sorting sequence.

In a standard OI system these sort of rules would need the developer to define the collation sequence records that represent them, which is simply duplicating effort when Windows itself is easily capable of handling this for us.

Using Linguistic Mode

In order to utilize this API without impacting current systems we have introduced a new “mode” into OpenInsight that allows you to determine exactly when you wish to enable linguistic support. This mode comprises three elements:

  1. Mode ID – this is the mode itself, which can be one of the following values:
    • (0) Normal, non-linguistic mode.
    • (1) Linguistic mode.
  2. Mode Flags – A set of bit-wise flags for use with the Linguistic mode.
  3. Mode Locale – A locale identifier for use with the Linguistic mode (defaults to the current user’s locale).

It’s simply a case of setting the mode when you want it to apply to sorting and case-insensitive operations, and turning it off when you don’t. Just like with Extended Precision Mode you can set a default mode for your application and then adjust this at runtime as desired.

(Note that using the Linguistic mode is not affected by OpenInsight’s ANSI or UTF8 mode, as the string comparisons are processed “outside” in Windows itself.)

The following five functions are used to control the Linguistic Mode:

  • GetDefaultStrCmpMode – returns the default application mode settings.
  • SetDefaultStrCmpMode – sets the default application mode.
  • GetStrCmpMode – returns the current mode settings.
  • SetStrCmpMode – sets the current mode.
  • GetStrCmpStatus – returns the status of a string comparison operation.

Along with this set of equates:

  • RTI_STRCMPMODE_EQUATES
  • MSWIN_COMPARESTRING_EQUATES

Example:

$Insert RTI_StrCmpMode_Equates
$Insert MSWin_CompareString_Equates

// Set the mode to Linguistic, sorting digits as numbers, case-insensitive, 
// and with linguistic casing, using the "en-UK" locale
SCFlags = BitOr( LINGUISTIC_IGNORECASE$, NORM_LINGUISTIC_CASING$ )
SCFlags = BitOr( SCFlags, SORT_DIGITSASNUMBERS$ )

Call SetStrCmpMode( STRCMPMODE_LINGUISTIC$, SCFlags, "en-UK" ) 

// Now do some sorting ...
Call V119( "S", "", "A", "L", data, "" )

Full details on each of these functions can be found at the end of this post, but let’s take a look in more detail at the each of the mode settings:

Mode ID

This is an integer value that controls how string comparisons are made:

When set to “0” then the application will run in “normal” mode, which means that string comparisons will use the methods described in parts 1 and 2 of this series. The Mode Flags and Mode Locale settings are ignored.

When set to “1” the application uses the Windows CompareStringEx function for string comparisons instead. The Mode Flags and the Mode Locale settings will also be used with this.

Mode Flags

This setting is a integer comprising one or more optional bit-flags that are passed to the Windows CompareStringEx function when running in Linguistic Mode (It may be set to 0 to apply the default behavior). A full description of their use can be found in the Microsoft documentation for the CompareStringEx function, but briefly these are:

FlagDescription
LINGUISTIC_IGNORECASE$Ignore case, as linguistically appropriate.
LINGUISTIC_IGNOREDIACRITIC$Ignore nonspacing characters, as linguistically appropriate.
NORM_IGNORECASE$Ignore case.
NORM_IGNOREKANATYPE$Do not differentiate between hiragana and katakana characters.
NORM_IGNORENONSPACE$Ignore nonspacing characters.
NORM_IGNORESYMBOLS$Ignore symbols and punctuation.
NORM_IGNOREWIDTH$Ignore the difference between half-width and full-width characters.
NORM_LINGUISTIC_CASING$Use the default linguistic rules for casing, instead of file system rules.
SORT_DIGITSASNUMBERS$Treat digits as numbers during sorting.
SORT_STRINGSORT$Treat punctuation the same as symbols.

Mode Locale

This is can be the name of the locale to use (like “en-US”, “de-CH” etc.), or one of the following special values:

  • “0” or null – Use the current user locale (LOCALE_NAME_USER_DEFAULT).
  • “1” – Use the current OS locale (LOCALE_NAME_SYSTEM_DEFAULT).
  • “2” – Use an invariant locale that provides stable locale and calendar data (LOCALE_NAME_INVARIANT)

The Linguistic Mode and Basic+

The following Basic+ operators and functions are affected by the Linguistic Mode :

  • LT operator
  • LE operator
  • EQ operator
  • NE operator
  • GE operator
  • GT operator
  • _LTC operator
  • _LEC operator
  • _EQC operator
  • _NEC operator
  • _GEC operator
  • _GTC operator
  • IndexC function
  • V119 function
  • Locate By statement
  • LocateC statement

Note that when used with the case-insensitive operators and functions (such as _eqc, IndexC() etc.) the LINGUISTIC_IGNORECASE$ flag is always applied if the NORM_IGNORECASE$ has not been specified.

Performance considerations

Using the Linguistic Mode can impact performance for two reasons:

  1. There is just more work to do – comparison of strings using more complex rules will always be slower that a simple comparison of ordinal byte values or code points.
  2. The strings must be copied and transformed into UTF16 (wide) strings before passing to the Windows CompareStringEx function. While this is not a slow operation in and of itself it will add some overhead.

Because of this Linguistic Mode is not enabled by default – you are free choose when to apply it yourself.

String Comparison Mode functions

GetDefaultStrCmpMode function

This function returns an @fm-delimited dynamic array containing the current default string comparison mode settings for the application in the format:

<1> Mode
<2> Flags
<3> Locale

Example:

$Insert RTI_StrCmpMode_Equates

DefSCM  = GetDefaultStrCmpMode()
DefMode = DefSCM<GETSTRCMPMODE_MODE$>

SetDefaultStrCmpMode function

This function sets the default string comparison mode for an application. The mode is set to these default values for each new request made to the engine (i.e each event or web-request).  This is to protect against situations where an error condition could force the engine to abort processing before the mode could be reset, thereby leaving it in an unknown state.

This function takes three arguments:

NameDescription
ModeSpecifies the default mode to set: “0” for Normal mode, or “1” for Linguistic Mode.
FlagsBitmask integer that specifies the default flags to use when in Linguistic Mode
LocaleSpecifies the name of the default locale to use.

Example:

$Insert RTI_StrCmpMode_Equates
$Insert MSWin_CompareString_Equates

// Set the default mode to Linguistic, sorting digits as numbers, using the
// user's locale
SCFlags = SORT_DIGITSASNUMBERS$

Call SetDefaultStrCmpMode( STRCMPMODE_LINGUISTIC$, SCFlags, "" ) 

GetStrCmpMode function

This function returns an @fm-delimited dynamic array containing the current string comparison mode settings for the application in the format

<1> Mode
<2> Flags
<3> Locale

Example:

$Insert RTI_StrCmpMode_Equates

CurrSCMode = GetStrCmpMode()<GETSTRCMPMODE_MODE$>

SetStrCmpMode function

This function sets the current string comparison mode for an application. Note that the mode is set to the default values for each new request made to the engine (i.e each event or web-request).  This is to protect against situations where an error condition could force the engine to abort processing before the mode could be reset, thereby leaving it in an unknown state.

This function takes three arguments:

NameDescription
ModeSpecifies the mode to set: “0” for Normal mode, or “1” for Linguistic Mode.
FlagsBitmask integer that specifies the flags to use when in Linguistic Mode
LocaleSpecifies the name of the locale to use.

Example:

$Insert RTI_StrCmpMode_Equates
$Insert MSWin_CompareString_Equates

// Set the mode to Linguistic, sorting digits as numbers, case-insensitive, 
// and with linguistic casing, using the "en-UK" locale
SCFlags = BitOr( LINGUISTIC_IGNORECASE$, NORM_LINGUISTIC_CASING$ )
SCFlags = BitOr( SCFlags, SORT_DIGITSASNUMBERS$ )

Call SetStrCmpMode( STRCMPMODE_LINGUISTIC$, SCFlags, "en-UK" ) 

// Now do some sorting ...
Call V119( "S", "", "A", "L", data, "" )

GetStrCmpStatus function

While it is unlikely that the CompareStringEx function will raise any errors it is possible if incompatible flags or parameters are used. In this case Windows returns an error code which may be accessed in Basic+ via this function (See the CompareStringEx documentation for more details on error values).

Example:

$Insert RTI_StrCmpMode_Equates
$Insert MSWin_CompareString_Equates

// Set the mode to Linguistic, sorting digits as numbers, case-insensitive, 
// and with linguistic casing, using the "en-UK" locale
SCFlags = BitOr( LINGUISTIC_IGNORECASE$, NORM_LINGUISTIC_CASING$ )
SCFlags = BitOr( SCFlags, SORT_DIGITSASNUMBERS$ )

Call SetStrCmpMode( STRCMPMODE_LINGUISTIC$, SCFlags, "en-UK" ) 

// Now do some sorting ...
Call V119( "S", "", "A", "L", data, "" )

SCError = GetStrCmpStatus()
If SCError Then
   ErrorText = RTI_ErrorText( "WIN", SCError )
End

Conclusion

This concludes this mini-series on OpenInsight string comparison processing. Hopefully you’ll find the new Linguistic Mode useful in your own applications, bearing in mind that some of the custom sorting options, such as “Treat Digits As Numbers”, can have a use in any application beyond simply dealing with non-English language sets.

Some of the more astute readers among you may have noticed that no mention of indexing has been made so far with respect to Linguistic Mode. This is because work is currently ongoing in this part of the system, and we’ll give you more details regarding this at a later date.

Further reading

More information on this subject may be found here:

String comparison in OpenInsight – Part 2 – UTF8 Mode

Welcome to the second part of our mini-series explaining the mechanics of how string comparisons are handled in OpenInsight. In our previous post we looked at the inner workings when running in ANSI mode – this time we’ll look at UTF8 mode instead.

(Note that we’ve included some Basic+ pseudo-code examples in this post to illustrate more clearly how some parts of the comparison routines work. These are simplifications of the actual C++ internal functions and not actual code from the system itself.)

String comparison in UTF8 Mode

In UTF8 mode characters can be multi-byte and therefore have a value greater than 255 (normally referred to as their “code point”, or in Basic+ terms, the Seq() value of a character), so this means that the standard ANSI-mode method described previously cannot be used. Instead, a slightly different approach is taken to allow higher code points to be included in custom sorting.

When the system is loaded the UTF8 library creates an internal character-map (called the “ANSI-map”) which is a 256-element array (0-255) of code-point values. This is initialized to the same values as the standard ANSI character set, i.e. position 65 will have the code point for the ANSI character with the value of 65, position 230 will have the code point for the ANSI character with the value of 230 and so on.

This ANSI-map this can be changed at runtime so that code points that are higher than 255 can be included, and code points that appear in the ANSI-map are always sorted lower than those that aren’t, regardless of their actual value. The following functions (exported from RevUTF8.dll) are used to query and update the ANSI-map:

GetAnsiToUnicode – returns the code point for a specified map element.

// MapIndex - must be an integer between 0 and 255 
CodePoint = GetAnsiToUnicode( MapIndex )  

SetAnsiToUnicode – updates the code point for a specified map element.

// MapIndex - must be an integer between 0 and 255
// NewCodePoint - integer value of the code point to set
Call SetAnsiToUnicode( MapIndex, NewCodePoint )

UTF8 comparison method

When comparing two characters we first need to find a “sort index” for a character which is determined as follows:

  • Get the code point value for the character being compared.
  • Look in the ANSI-map using the low byte value of the code point as the index. If the value at that position is the same as the character code point then the sort index is set to that index and it is marked as “found”.
    • E.g. If the character has a code point value of 458 (0x1CA) then it’s low-byte value is 202 (0xCA). If the ANSI-map contains the value 458 at index 202 then the sort index is set to 202 and it is marked as “found”.
  • Otherwise, scan backwards through the ANSI-map looking for an element that has the same value as the code-point for the character. If we match it then the sort index is set to the same position and it is marked as “found”.
// Pseudo-code
dim ansiMap( 255 )

sortIndex = -1 ; // Not found
codePoint = seq( ch )
testIndex = bitAnd( codePoint1, 0xFF )
if ( ansiMap( testIndex ) == codePoint ) then
   // Found
   sortIndex = testIndex
end else
   // Not found
   for testIndex = 255 to 0 step -1
      if ( ansiMap( testIndex ) == codePoint ) then
         // Found and exit loop
         sortIndex = testIndex
      end         
   next
end

Once this has been done for both characters we use the following comparison procedure:

  • If one of the characters is marked as “not found” and the other as “found”, the latter is always sorted before the former.
  • Otherwise we now proceed in a manner similar to the ANSI comparison:
    • If we are using a collation sequence the sorting value for each character is extracted from the appropriate sequence using the sort index we determined above.
      • E.g. if the sort index was 202 then the sort value for the comparison is the value of the byte at position 203 (1-based) in the sequence.
    • If we are using a case-insensitive comparison without a collation sequence the two sort indexes (not values!) are masked with 0xDF and compared.
    • If we are using a case-sensitive comparison without a collation sequence the two original code-point values are compared.
// Pseudo-code
begin case
   case ( sortIndex1 == -1 ) and ( sortIndex2 == -1 )
      // Both Non-ANSI-mapped - use a simple code point compare
      cmpVal = codePoint1 - codePoint2
   case ( sortIndex1 == -1 )
      // sortIndex2 was found in the ANSI map so it's sorted lower
      cmpVal = 1
   case ( sortIndex2 == -1 )
      // sortIndex1 was found in the ANSI map so it's sorted lower
      cmpVal = -1
   case OTHERWISE$
      // Both are ANSI mapped
      begin case
         case hasCollationSequence
            sortVal1 = seq( collationSequence[sortIndex1+1,1] )
            sortVal2 = seq( collationSequence[sortIndex2+1,1] )
            
         case isCaseInsensitive         
            sortVal1 = bitAnd( sortIndex1, 0xDF )
            sortVal2 = bitAnd( sortIndex2, 0xDF )
            
         case OTHERWISE$
            sortVal1 = codePoint1
            sortVal2 = codePoint2
            
      end case
      
      cmpVal = sortVal1 - sortVal2
      
end case

So, this system works pretty well out of the box for languages that can be expressed using the ANSI character set, but for other languages much of the burden falls on the application developer to maintain and tune the language settings and collation sequences to their requirements. This could require considerable effort and ignores much of the functionality provided by the OS itself, so in the next post we’ll take a look at how this is being addressed in the next release.