Basic+ is a dynamically-typed language, and sometimes the system will attempt to change the type of a variable before using it in an operation. For example, if we have variable containing a numeric value as a string, and we attempt to compare it to a variable containing an actual numeric value, then the former will be coerced to a numeric format before the comparison, e.g.:
StrVar = "3" ; // String representing an integer
NumVar = 7 ; // Integer
Ans = ( StrVar = NumVar ) ; // Ans is FALSE$ and StrVar
; // is converted to an Integer
This works well in the vast majority of cases, but causes issues when attempting to compare strings that that could be numbers but really aren’t for the purposes of the comparison. For example, consider the following:
Var1 = "12" ; // String
Var2 = "012" ; // String
Ans = ( VarA = VarB ) ; // Ans = TRUE$ - should be FALSE$!!
The system first attempts to coerce the operands to a numeric format, and of course it can because they both evaluate to the number 12, and are therefore treated as equal. If you wish to compare them both as strings, the usual solution has been to change them so that they cannot be converted to a number, usually by prefixing them with a letter or symbol like so:
Var1 = "12" ; // String
Var2 = "012" ; // String
Ans = ( ( "X" : VarA ) = ( "X" : VarB ) ) ; // Ans = FALSE$
Although this works it hurts performance and can be a cumbersome solution.
OpenInsight 10.2 introduces a new set of extended comparison operators to Basic+ that ensure both operands are coerced to a string type (if needed) before being evaluated, and provides the following benefits:
Removes the need for any concatenation thereby avoiding unnecessary copying
Removes the need to check if a variable can be a number before the comparison
Makes it obvious that you are performing a “string-only” comparison so your code looks cleaner.
The new (case-sensitive) operators are:
Operator
Description
_eqs
Equals operator
_nes
Not Equals operator
_lts
Less Than operator
_les
Less Than Or Equals operator
_gts
Greater Than operator
_ges
Greater Than Or Equals operator
There is also matching set of case-insensitive operators too:
Operator
Description
_eqsc
Case-Insensitive Equals operator
_nesc
Case-Insensitive Not Equals operator
_ltsc
Case-Insensitive Less Than operator
_lesc
Case-Insensitive Less Than Or Equals operator
_gtsc
Case-Insensitive Greater Than operator
_gesc
Case-Insensitive Greater Than Or Equals operator
So we could write the previous example like so:
Var1 = "12" ; // String
Var2 = "012" ; // String
Ans = ( VarA _eqs VarB ) ; // Ans = FALSE$
Hopefully you’ll find this a useful addition when performing string-only comparisons in your own applications.
With the release of version 10.1 a new control type called DIRWATCHER (“Directory Watcher”) has been added to OpenInsight. This is a fairly simple control which allows you to monitor one or more directories on your system and then receive notifications when the contents are changed.
Using the control is very straightforward:
Use the WATCHDIR method to add a directory to monitor for changes.
Handle the CHANGED event to receive notifications of directory changes.
Use the STOP method to stop monitoring directories.
We’ll take a quick look at each of these methods and events below along with a couple of important properties:
The WATCHDIR method
This method allows you to specify a directory to monitor along with some optional flags. It may be called multiple times to watch more than one directory.
The flag values are specified in the MSWIN_FILENOTIFY_EQUATES insert record,
This method returns TRUE$ if successful, or FALSE$ otherwise.
The STOP method
This method stops the control monitoring its specified directories.
bSuccess = Exec_Method( CtrlEntID, "STOP" )
This method returns TRUE$ if successful, or FALSE$ otherwise.
(Note – to resume directory monitoring after the STOP method has been called the WATCHDIR method(s) must be executed again).
The CHANGED event
This event is raised when changes have been detected in the monitored directories.
bForward = CHANGED( NewData )
This event passes a single parameter called NewData which contains an @vm-delimited list of changed items (i.e. notifications). Each item in the list comprises an “action code” and the name and path of the affected file, delimited by an @svm.
Action codes are defined in the MSWIN_FILENOTIFY_EQUATES insert record like so:
equ FILE_ACTION_ADDED$ to 0x00000001
equ FILE_ACTION_REMOVED$ to 0x00000002
equ FILE_ACTION_MODIFIED$ to 0x00000003
equ FILE_ACTION_RENAMED_OLD_NAME$ to 0x00000004
equ FILE_ACTION_RENAMED_NEW_NAME$ to 0x00000005
Remarks
If a monitored directory experiences a high volume of changes (such as copying or removing thousands of files) it could generate a correspondingly high number of CHANGED events, which in turn could produce an adverse affect on your application and slow it down. In order to deal with this potential issue it is possible to “bundle up” multiple notifications with the NOTIFYTHRESHOLD property into a single CHANGED event so they may be processed more efficiently.
The NOTIFYTHRESHOLD property
The NOTIFYTHRESHOLD property is an integer that specifies the maximum number of notifications that should be bundled before a CHANGED event is raised.
The NOTIFYTIMER property is an integer that specifies the number of milliseconds before a CHANGED event is raised if the NOTIFYTHRESHOLD property value is not met.
If the NOTIFYTHRESHOLD property is set to a value greater than 1 then the control will try to bundle that number of notifications together before raising a CHANGED event. However, when this is set to a high value it is possible that the threshold may not be reached in a timely fashion and the CHANGED event not actually raised.
E.g. If the NOTIFYTHRESHOLD is set to 1000, and only 200 notifications are received then the CHANGED event would not be raised.
To prevent this problem the NOTIFYTIMER property may be used to specify the amount of time after receiving the last notification before a CHANGED event is raised even if the NOTIFYTHRESHOLD is not met.
E.g. in the example above, if the control had a NOTIFYTIMER of 50, then a CHANGED event would be raised 50ms after the last notification (200) was received, even though the NOTIFYTHRESHOLD of 1000 has not actually been met.
Developer Notes
The DIRWATCHER control is intended as a “non-visual” control and should probably be hidden at runtime in your own applications. However, it is actually derived from a normal STATIC control so all of the properties and methods that apply to a STATIC apply to the DIRWATCHER as well, and you may use them as normal if you wish.
When executing a long running process in a desktop environment, such as selecting records from a large table, it is important for the user to be able to interact with the application even though it is busy. A common example of this is displaying a dialog box that shows the progress of an operation and allowing the user to cancel it if they wish. Failure to provide this ability generally results in user frustration, or causes the dreaded “Window Ghosting” effect where Windows changes a form’s caption to “Not Responding” (this is never a good look, and usually ends in a quick visit to the Task Manager to kill the application).
In order to avoid this problem we have to allow the application to check for user interaction, a process usually referred to as “yielding” (hence the awful title of this post), and this time we’ll take a look at the various options available to accomplish this and the differences between them. Before we go any further however, here’s a little background information on how OpenInsight runs beneath the hood so that you can appreciate how messages and events are handled.
Under the hood
OpenInsight.exe (aka. the “Presentation Server”, or “PS”) has a main thread (the “UI thread”) with a Windows message loop that manages all of the forms and controls, and it also has an internal “event queue” for storing Basic+ events that need to be executed. The PS also creates an instance of the RevEngine virtual machine (“the engine”), which has its own thread (the “engine thread”) with a Windows message loop, and is responsible for executing Basic+ code.
When the PS needs to execute an event it passes it to the engine directly if possible, otherwise it adds the details to the event queue and then posts a message to itself so the queue can be checked and processed when the engine is not busy. When the engine receives the event data it is executed on the engine thread. Stored procedures such as Get_Property, Set_Property, and Exec_Method provide a way for the Basic+ event to communicate back to the PS to interact with the user interface controls and forms during its execution.
The key point to note here is that Basic+ event code runs in a different thread to the UI, so while the engine thread is processing the event, the UI thread is basically waiting for it to finish, and this means that it may or may not get chance to process it’s own message loop. This is where the problems can begin, and why the need for a yielding ability, because:
The engine thread needs to be paused or interrupted in some fashion so that the UI thread can check and process its own Windows message queue for things like mouse, keyboard and paint messages. If this queue is not checked at least every 10 seconds then Windows assumes that the PS is hung and the “Not Responding” captions are shown on the application forms.
While the engine is processing an event, the PS cannot pass it a new one, so it is added to the event queue. If we are waiting to process some Basic+ event code like a button CLICK to cancel the current operation, then we need some way for this to be retrieved and executed before the current event is finished.
So, now we know why “window ghosting” happens we can take a look at the various options to deal with it.
Options for yielding
MSWin_Sleep stored procedure
This is a direct call to the Windows API Sleep function, and it puts the engine thread to sleep for at least the specified number of milliseconds. However, while calling this will allow Windows to schedule another thread to run, there’s no guarantee that this would be the UI thread, so it’s not really a good solution.
WinYield stored procedure
This is a simple wrapper around the Windows API Sleep function, with a sleep-time of 10ms. This suffers from the same disadvantages discussed for MSWin_Sleep above (This function remains for backwards compatibility with early versions of of OI and Windows).
MSWin_SwitchToThread stored procedure
This is a direct call to the Windows API SwitchToThread function which forces Windows to schedule another thread for execution. Like MSWin_Sleep and WinYield there’s no guarantee that this would cause the UI thread to run, so again it’s not a great solution.
SYSTEM PROCESSEVENTS method (a.k.a Yield stored procedure)
This is a new method in version 10.1 that performs two tasks that solve the problem:
It explicitly tells the UI thread to process its message queue (which will avoid the “ghosting” issue), and
It allows the UI thread to process the event queue so waiting events can be executed as well.
One possible drawback here is that waiting events are also processed, and this might not be a desirable outcome depending on what you are doing. In this case there is another method called PROCESSWINMGS that should be used instead.
(FYI – The PROCESSEVENTS method is essentially a wrapper around the venerable Yield() stored procedure, but allows the same functionality to be called via the standard object-based method API rather than as a “flat” function. Yield() itself is still and will be fully supported in the product).
SYSTEM PROCESSWINMSGS method
This is a new method in version 10.1 that tells the PS to process it’s Windows message queue but it does not process any Basic+ events, i.e. it prevents the “ghosting” effect but does not cause events to fire before your application is ready for them.
Conclusion
Version 10.1 has added more functionality to help you avoid the dreaded “Not Responding” message via the PROCESSWINMGS and PROCESSEVENTS methods, and hopefully, armed with the information above, this will help you to write better integrated desktop-applications.
RList is the OpenInsight tool for queries and ad hoc reports. OpenInsight 10 implements an extended version of RList via the new RTI_RLISTX stored procedure which offers additional features for selecting and reporting data. However, you do not need to change existing programs to take advantage of this.
This post takes a look at some of the new RList capabilities, along with a full description of the API and some code examples.
OLIST/RUN_REPORT syntax support
LIST statements can use the same syntax as RUN_REPORT or OLIST if the output is going to the screen or printer.
With OLIST LIST statements you can use keywords like GRID and PDFFILE. When the target is TARGET_PRINTER$, RList calls OLIST_PRODUCE to render the output.
Stacked selects
RList accepts a stack of SELECT statements and optionally one output (LIST) statement. RList will execute SELECT statements until zero records are selected or a LIST statement is executed. This allows developers to break a complex query into a series of simpler SELECT statements. Each subsequent SELECT statement refines the active select list.
RList to BRW
The TARGET_BRWDESIGN$ option will create a BRW (Banded Report Writer) report from an RList statement and open it in the BRW Designer for the developer to refine it. RList will prompt for the report group name. The intent is to let you quickly rough out a report or even a set of master/detail reports using RList, then use the BRW Designer to refine the result.
RList to a variable
The TARGET_VARIABLE$ option will return the result of an RList statement into a variable. For example, you can get a list of keys without using loop/readnext, you can populate an edit table with a SELECT statement, and you can obtain CSV, XML or JSON data by calling RList. This is similar to the OpenInsight 9 SELECT_INTO stored procedure. The SELECT_INTO syntax is still supported and now calls RTI_RLISTX internally to implement the commands instead.
Cursor support
RList support allows you to specify a cursor number between 0 and 8, or -1 to use the next non-zero cursor. OpenInsight 9 offered cursor support but the RList interface did not. Cursors permit sub-queries that don’t corrupt the main select loop. Sub-queries can operate on other BFS’s too. For example OpenInsight calculated columns can query SQL tables.
Reduce, Select by, ReadNext, Swap.Cursor, Make.list, Save_Select, Activate_Save_Select all support cursors in OpenInsight 10.
Performance enhancements
RList implements many optimizations in selecting and reporting:
It will use indexes to refine an existing select, whereas previous versions only use indexes on a select without an active select list.
Caching can reduce server IO so RList can now cache rows as they are read if it knows that the same rows will be sorted or reported – previous versions always read the records to select, sort and report.
RESOLVE_SELECT, the program which finalizes a select, is improved. If you specify the number of rows, RESOLVE_SELECT exits when it reaches that number of rows rather than resolving all of the rows before applying the limit.
RList calls a new routine, RTI_CHAIN_SELECT, to pre-process selects which has some query optimization built in. For example, it will select on indexed fields before non-indexed, and hard fields before calculated. It performs sorts after the all selects are completed.
This parameter should contain one or more OpenList (SELECT/LIST) statements, separated by field marks. RList will process each statement sequentially until it exhausts the list of keys, selects zero rows, or executes a LIST statement. Note that in OpenInsight 10 RList accepts the same syntax as the classic OLIST or RUN_REPORT procedures as well as that used in TCL and any legacy OpenInsight 9 syntax.
Target parameter
Target is a code indicating the desired output format. OpenInsight supplies an insert which enumerates the options for target. See the RLIST_EQUATES insert record for more details.
Target
Value
Description
TARGET_PRINTER$
0
Sends the output of a LIST statement to the printer.
TARGET_CLIENT$
1
Sends the output of a LIST statement to the screen.
TARGET_VARIABLE$
2
Returns the output of a LIST statement to a variable.
TARGET_CALLBACK$
3
Triggers an RLIST_CALLBACK routine.
TARGET_SAVELIST$
4
Performs a SAVE_SELECT operation on the result of select statements.
TARGET_ACTIVELIST$
5
Activates a cursor with the result of select statements.
TARGET_LATENTLIST$
6
Creates a latent cursor for subsequent ReadNext processing.
TARGET_CHECKSYNTAX$
7
Checks the statements for valid syntax but does not execute them.
TARGET_CALLBACK_NOFMT$
8
Triggers an RLIST_CALLBACK routine but with no formatting or truncation applied to the returned values.
TARGET_BRWDESIGN$
9
Generate a Banded report, and open it in the designer.
TARGET_BRWRUN$
10
Generate a Banded report, and execute immediately.
TargetName parameter
This parameter is polymorphic. You supply different values for different combinations of target and SELECT or LIST statements:
Target
Select Statement
LIST statement
TARGET_PRINTER$
N/a
N/a
TARGET_CLIENT$
N/a
N/a
TARGET_VARIABLE$
RList will return the keys if the output format is “KEYS”. See the UserArg parameter.
RList will return the output into the variable. The format of the output depends on the UserArg parameter.
TARGET_CALLBACK$
N/a
The name of an “RList callback” procedure.
TARGET_SAVELIST$
The name of the list to save. The string you pass is the list name in the SYSLISTS table. If you pass a space delimited string, RLIST will write the list to a table other than SYSLISTS using the first word as the name of the table and the second as the id of the list.
N/a
TARGET_ACTIVELIST$
N/a
N/a
TARGET_LATENTLIST$
N/a
N/a
TARGET_CHECKSYSTAX$
N/a
N/a
TARGET_CALLBACK_NOFMT$
N/a
The name of an “RList callback” procedure. RList will not enforce column widths on the output. Used by SELECT_INTO in OpenInsight 9.
TARGET_BRWDESIGN$
N/a
The name of the report group and report to generate.
TARGET_BRWRUN$
N/a
N/a
UserArg parameter
Another polymorphic parameter whose format depends on the chosen Target parameter:
Target
Select Statement
LIST statement
TARGET_PRINTER$
N/a
UserArg<2> = cursor number
TARGET_CLIENT$
N/a
UserArg<2> = cursor number
TARGET_VARIABLE$
UserArg<1> = output format. “KEYS” is the only relevant format. UserArg<2> = cursor number
UserArg<1> = ResultFormat ( see below ) UserArg<2> = cursor number
TARGET_CALLBACK$
N/a
UserArg<2> = cursor number
TARGET_SAVELIST$
UserArg<2> = cursor number
N/a
TARGET_ACTIVELIST$
UserArg<2> = cursor number
N/a
TARGET_LATENTLIST$
N/a
TARGET_CHECKSYSTAX$
N/a
N/a
TARGET_CALLBACK_NOFMT$
N/a
UserArg<2> = cursor number
TARGET_BRWDESIGN$
N/a
The name of the report group and report to generate.
TARGET_BRWRUN$
N/a
N/a
ResultFormat
ResultFormat values are applicable when returning the result to a variable using the TARGET_VARIABLE$ Target.
ResultFormat
Description
ADO
Result is an OLE record-set containing VARCHAR values.
CSV
Result is a set of comma separated values, all values quoted, commas between columns, carriage-return/linefeed characters between rows.
EDT
Edit Table format. Row 1 is column headings, @vm between columns, @fm between rows. Useful for populating edit tables using RList statements
HTML
Result is an HTML table.
JSON
Result is a JSON array of row objects, each column is an object in the row, multi-values are arrays.
KEYS
Result is an @fm-delimited list of keys.
MV
Default format. Result is an array – @fm between columns, @rm between rows. Row 1 is the column headers.
TAB
Result is a set of tab-delimited column values, carriage-return/linefeed characters between rows.
TXT
Same as TAB (see above).
XML
Result is an XML collection of rows, each row is an XML collection of columns.
Cursor Number
Cursor numbers specified in the UserArg parameter should be one of the following values:
An integer between 0 and 8
Null to use for cursor 0 (the default)
-1 for next available, which chooses an inactive cursor from 1 to 8
DebugFlag parameter
N/a.
Code Examples
RList to PDF
RList To PDF
Stacked Queries
Sometimes it’s easier to execute a series of select statements rather than a single complex query. RList accepts a list of selects with an optional closing list statement. RList will execute until completed or no records are selected. If one of the conditions selects nothing then the list statement will not run.
Stacked queries
Query using a cursor
RList now supports cursors like REDUCE and SELECT BY. Indicate the cursor in field two of the UserArg parameter (in other words, UserArg<2>). Use this to execute sub queries. The sub query can return keys from a select or the output from a LIST statement.
Query using alternate cursor
Sub-query without corrupting the main select
Sub-query example
Sub query in calculated column
You can use sub-queries in calculated columns too. A cursor variable of -1 uses the next available cursor. This allows you to nest calculated columns which perform sub-selects. If you always use subqueries rather than direct calls to btree.extract then your logic will function with or without indexes. You can make indexing choices in the performance tuning stage of development rather than when designing the dictionaries.
Calculated column using sub-query
Return results to a variable
A new target type, TARGET_VARIABLE$ ( 2 ), will return the output to a variable. Pass a variable as the third parameter and an output format as the fourth. For SELECT statements, you must specify ‘KEYS’ as the output format. For LISTS statements you supply one of the formats listed above.
Select keys into a variable
Populating an edit table control
The “EDT” output format is convenient for loading data into edit table controls:
Populate an edit table control
Conclusion
RList for OpenInsight 10 offers new features as well as improved performance. Some of the features are available in previous versions but not via the RList interface. We tried to unify the disparate query capabilities (rlist, reduce, btree.extract, select_Into, olist ) under a single interface so you can focus on functionality and we can focus on performance.
One of the most popular “raw” Windows API functions that OpenInsight developers have used over the years is the ShellExecute function, which allows you to launch an application via its filename association, e.g. you can launch Word by using a document file name, or Excel using a spreadsheet filename and so on.
However, because it was never really made an “official” part of the product (it was normally passed on in forums), developers were left to create their own DLL Prototype definitions in order to use it – this gave rise to many variations over the years, many of which were not compatible with others. For example, some use LPCHAR as an argument type, some use LPSTR or LPASTR, whilst others use LPVOID with GetPointer(); some definitions use the “Wide” version of the function, some the “Ansi” version, and there are many different aliases, with or without the “A/W” suffix too. The list goes on.
For OpenInsight 10 we decided that we couldn’t move forward with this as we would run the risk of conflicting with established applications, so we moved all of the DLL Prototypes we used into a new namespace called “MSWIN_” and claimed it as our own. This left developers to bring forward their own DLL prototypes into version 10 as and when needed, and therefore we didn’t supply a “ShellExecute” function as such, though we did supply “MsWin_ShellExecute” instead (see below).
Another decision we took was to try and move away from the need for developers to use raw Windows API function calls as much as possible, as some of them can be complex and require knowledge of C/C++ programming, which is not necessarily a skill set that everyone has the time or desire to learn. Ergo, we moved a lot of functionality into the Presentation Server (PS) and created some Basic+ wrapper functions around others to shield developers from the sometimes gory internals.
(We also chose to use the “W” versions of functions rather than the “A” versions where possible, because these would translate better when in UTF8 mode and remove the need for an extra “A”->”W” conversion in Windows itself.)
So, coming back to ShellExecute, and in light of the above, we have three “official” and supported ways of calling it in OpenInsight 10 as detailed below:
The SYSTEM object SHELLEXEC method
The RTI_ShellExecuteEx stored procedure
The MSWin_ShellExecute DLL Prototype stored procedure
The SYSTEM object SHELLEXEC method
If your program is running in “Event Context”, (i.e. it is executing in response to an event originating from the PS) then you may use the SYSTEM SHELLEXEC method which invokes ShellExecuteW internally.
Name of a form to use as a parent for displaying UI messages.
Operation
No
Operation to be performed; “open”, “edit”, “print” etc.
File
Yes
File to perform the operation on.
Parameters
No
If File is an executable file this argument should specify the command line parameters to pass to it.
WorkingDir
No
The default working directory for the operation. If null the current working directory is used.
ShowCmd
No
Determines how an application is displayed when it is opened (as per the normal VISIBLE property).
The return value is the value returned by ShellExecuteW.
The RTI_ShellExecuteEx method
This stored procedure is a wrapper around the Windows API ShellExecuteExW function (which is used internally by ShellExecuteW itself), and may be used outside of event context – it can also return the handle to any new process it starts as a result of executing the document. As you can see it’s quite similar to the SHELLEXEC method:
Whilst you are free to use one of the methods outlined above, this may not be optimal if you are still sharing code between your existing version 9 application and your new version 10 one. In this case there are a couple of options you could use:
Define your preferred DLL prototype in v10.
Use a wrapper procedure and conditional compilation.
Defining your own prototype
This is probably the easiest option – you simply use the same prototype in v10 that you did in version 9, with the same alias (if any), and this way the code that uses it doesn’t need to be changed. The only downside to this if you’ve used any 32-bit specific data types instead of 32/64-bit safe types like HANDLE (this could happen if you have a really old prototype) – you must ensure that you use types that are 64-bit compliant.
Using conditional compilation
This is a technique we used when writing the initial parts of v10 in a v9 system so our stored procedures would run the correct code depending on the platform they were executing on (it was actually first used to share code between ARev and OI many years ago!).
The v10 Basic+ compiler defines a token called “REVENG64” which is not present in the v9 compiler – this means that you can check for this in your source code with “#ifdef/#ifndef” directives and write code for the different compiler versions.
For example, you could write your own wrapper procedure for ShellExecute that looks something like this:
Compile Function My_ShellExecute( hwnd, verb, file, params, dir, nShow )
#ifdef REVENG64
// V10 Compiler - use RTI function
Declare Function RTI_ShellExecuteEx
RetVal = RTI_ShellExecuteEx( hwnd, verb, file, params, dir, nShow, "" )
#endif
#ifndef REVENG64
// V9 Compiler - use existing "raw" prototype
Declare Function ShellExecute
RetVal = ShellExecute( hwnd, verb, file, params, dir, nShow )
#endif
Return RetVal
And then call My_ShellExecute from your own code.
So, there ends the Saga of ShellExecute … at least for now.
In a recent post we provided a preview of the OpenInsight IMAGE API documentation for the upcoming release of version 10.1. As that proved quite popular we thought we’d provide some more, this time dealing with the Common GUI API (i.e. the basic interface that virtually every GUI object supports) and the WINDOW object API – two core areas of OI GUI programming.
Methods, not Events
One thing you may notice as you look through these documents is the addition of many new methods, such as SHOWOPTIONS or QBFCLOSESESSION – this is an attempt to tidy up the API into a more logical and coherent format that is a better fit for an object-based interface.
As we went through the product in order to document it, it became very apparent that there were many instances where events were being used to mimic methods, such as sending a WRITE event to save the data in a form, or sending a CLICK event to simulate a button click and so on. In object-based terminology this sort of operation would be performed by a method, which is a directive that performs an action – the event is a notification in response to that action. So, for example, you would call a “write” method to save your data and the system would raise a “write” event so you could deal with it.
Of course, this distinction will probably not bother many developers – just API purists like myself, but this does have another advantage if you like to use Object Notation Syntax (I do) – you can now perform actions such as reading and writing form data by using the”->” notation, whereas before you would have to use the Send_Event stored procedure which essentially breaks the object-based paradigm.
So instead of:
Call Send_Event( @Window, "WRITE" )
you would use the form’s WRITEROW method instead:
@@Window->WriteRow( "" )
which is a more natural fit for this style of programming.
(It is also easier to explain to new OI programmers who are used to other object-based languages and environments where everything is properties, methods and events).
Methods, not Stored Procedures
This brings us finally onto the topic of Stored Procedures and the object API, where several of these also fulfill the role of methods. For example, take the venerable Msg stored procedure used to display a message box for a parent form – a different way of treating this would be to have a SHOWMESSAGE method for the parent form rather than using a “raw” Msg call. Likewise for starting a new form: instead of using the raw Start_Window procedure, the SYSTEM and WINDOW objects now support a STARTFORM method instead.
Of course, none of this changes your existing code, nor is it enforced, it’s just something you can use if and when you wish to. However, even if my API pedantry hasn’t persuaded you to change your coding style, some of the new methods are worth investigating as they provide a better opportunity for us to extend the product’s functionality further – take a look at the WINDOW READROW and WRITEROW methods for an example of this – they support new features that we couldn’t do with just sending events.
In any case, here are the links – hopefully some light reading for your weekend!
Welcome to the final part of this mini series on the string comparison mechanics in OpenInsight. In the first two parts we reviewed how this task is currently handled in both ANSI and UTF8 modes, but this time we’ll take a look at a new capability introduced for the next release which is called the “Linguistic String Comparison Mode”.
As we’ve seen previously, there is certainly room for improvement when dealing with string comparisons and sorting in non-English languages, mainly due to the burden placed on the developer to maintain the sorting parameters, especially once the requirements extend beyond the basic ANSI character set. There is also no advantage taken of the capabilities of Windows itself, which provides a comprehensive National Language Support (NLS) API for testing strings for linguistic equality.
What is “linguistic equality”?
If you’re unfamiliar with the term “linguistic equality” it essentially means comparing strings according to the language rules of a specific locale thereby providing appropriate results for a user of that locale. For example, consider the following cases that illustrate how comparisons differ for the same characters in different locales:
Many locales equate the ae ligature (æ) with the letters ae. However, Icelandic (Iceland) considers it a separate letter and places it after Z in the sorting sequence.
The A Ring (Å) normally sorts with merely a diacritic difference from A. However, Swedish (Sweden) places the A Ring after Z in the sorting sequence.
In a standard OI system these sort of rules would need the developer to define the collation sequence records that represent them, which is simply duplicating effort when Windows itself is easily capable of handling this for us.
Using Linguistic Mode
In order to utilize this API without impacting current systems we have introduced a new “mode” into OpenInsight that allows you to determine exactly when you wish to enable linguistic support. This mode comprises three elements:
Mode ID – this is the mode itself, which can be one of the following values:
(0) Normal, non-linguistic mode.
(1) Linguistic mode.
Mode Flags – A set of bit-wise flags for use with the Linguistic mode.
Mode Locale – A locale identifier for use with the Linguistic mode (defaults to the current user’s locale).
It’s simply a case of setting the mode when you want it to apply to sorting and case-insensitive operations, and turning it off when you don’t. Just like with Extended Precision Mode you can set a default mode for your application and then adjust this at runtime as desired.
(Note that using the Linguistic mode is not affected by OpenInsight’s ANSI or UTF8 mode, as the string comparisons are processed “outside” in Windows itself.)
The following five functions are used to control the Linguistic Mode:
GetDefaultStrCmpMode – returns the default application mode settings.
SetDefaultStrCmpMode – sets the default application mode.
GetStrCmpMode – returns the current mode settings.
SetStrCmpMode – sets the current mode.
GetStrCmpStatus – returns the status of a string comparison operation.
Along with this set of equates:
RTI_STRCMPMODE_EQUATES
MSWIN_COMPARESTRING_EQUATES
Example:
$Insert RTI_StrCmpMode_Equates
$Insert MSWin_CompareString_Equates
// Set the mode to Linguistic, sorting digits as numbers, case-insensitive,
// and with linguistic casing, using the "en-UK" locale
SCFlags = BitOr( LINGUISTIC_IGNORECASE$, NORM_LINGUISTIC_CASING$ )
SCFlags = BitOr( SCFlags, SORT_DIGITSASNUMBERS$ )
Call SetStrCmpMode( STRCMPMODE_LINGUISTIC$, SCFlags, "en-UK" )
// Now do some sorting ...Call V119( "S", "", "A", "L", data, "" )
Full details on each of these functions can be found at the end of this post, but let’s take a look in more detail at the each of the mode settings:
Mode ID
This is an integer value that controls how string comparisons are made:
When set to “0” then the application will run in “normal” mode, which means that string comparisons will use the methods described in parts 1 and 2 of this series. The Mode Flags and Mode Locale settings are ignored.
When set to “1” the application uses the Windows CompareStringEx function for string comparisons instead. The Mode Flags and the Mode Locale settings will also be used with this.
Mode Flags
This setting is a integer comprising one or more optional bit-flags that are passed to the Windows CompareStringEx function when running in Linguistic Mode (It may be set to 0 to apply the default behavior). A full description of their use can be found in the Microsoft documentation for the CompareStringEx function, but briefly these are:
Flag
Description
LINGUISTIC_IGNORECASE$
Ignore case, as linguistically appropriate.
LINGUISTIC_IGNOREDIACRITIC$
Ignore nonspacing characters, as linguistically appropriate.
NORM_IGNORECASE$
Ignore case.
NORM_IGNOREKANATYPE$
Do not differentiate between hiragana and katakana characters.
NORM_IGNORENONSPACE$
Ignore nonspacing characters.
NORM_IGNORESYMBOLS$
Ignore symbols and punctuation.
NORM_IGNOREWIDTH$
Ignore the difference between half-width and full-width characters.
NORM_LINGUISTIC_CASING$
Use the default linguistic rules for casing, instead of file system rules.
SORT_DIGITSASNUMBERS$
Treat digits as numbers during sorting.
SORT_STRINGSORT$
Treat punctuation the same as symbols.
Mode Locale
This is can be the name of the locale to use (like “en-US”, “de-CH” etc.), or one of the following special values:
“0” or null – Use the current user locale (LOCALE_NAME_USER_DEFAULT).
“1” – Use the current OS locale (LOCALE_NAME_SYSTEM_DEFAULT).
“2” – Use an invariant locale that provides stable locale and calendar data (LOCALE_NAME_INVARIANT)
The Linguistic Mode and Basic+
The following Basic+ operators and functions are affected by the Linguistic Mode :
LT operator
LE operator
EQ operator
NE operator
GE operator
GT operator
_LTC operator
_LEC operator
_EQC operator
_NEC operator
_GEC operator
_GTC operator
IndexC function
V119 function
Locate By statement
LocateC statement
Note that when used with the case-insensitive operators and functions (such as _eqc, IndexC() etc.) the LINGUISTIC_IGNORECASE$ flag is always applied if the NORM_IGNORECASE$ has not been specified.
Performance considerations
Using the Linguistic Mode can impact performance for two reasons:
There is just more work to do – comparison of strings using more complex rules will always be slower that a simple comparison of ordinal byte values or code points.
The strings must be copied and transformed into UTF16 (wide) strings before passing to the Windows CompareStringEx function. While this is not a slow operation in and of itself it will add some overhead.
Because of this Linguistic Mode is not enabled by default – you are free choose when to apply it yourself.
String Comparison Mode functions
GetDefaultStrCmpMode function
This function returns an @fm-delimited dynamic array containing the current default string comparison mode settings for the application in the format:
This function sets the default string comparison mode for an application. The mode is set to these default values for each new request made to the engine (i.e each event or web-request). This is to protect against situations where an error condition could force the engine to abort processing before the mode could be reset, thereby leaving it in an unknown state.
This function takes three arguments:
Name
Description
Mode
Specifies the default mode to set: “0” for Normal mode, or “1” for Linguistic Mode.
Flags
Bitmask integer that specifies the default flags to use when in Linguistic Mode
Locale
Specifies the name of the default locale to use.
Example:
$Insert RTI_StrCmpMode_Equates
$Insert MSWin_CompareString_Equates
// Set the default mode to Linguistic, sorting digits as numbers, using the// user's locale
SCFlags = SORT_DIGITSASNUMBERS$Call SetDefaultStrCmpMode( STRCMPMODE_LINGUISTIC$, SCFlags, "" )
GetStrCmpMode function
This function returns an @fm-delimited dynamic array containing the current string comparison mode settings for the application in the format
This function sets the current string comparison mode for an application. Note that the mode is set to the default values for each new request made to the engine (i.e each event or web-request). This is to protect against situations where an error condition could force the engine to abort processing before the mode could be reset, thereby leaving it in an unknown state.
This function takes three arguments:
Name
Description
Mode
Specifies the mode to set: “0” for Normal mode, or “1” for Linguistic Mode.
Flags
Bitmask integer that specifies the flags to use when in Linguistic Mode
Locale
Specifies the name of the locale to use.
Example:
$Insert RTI_StrCmpMode_Equates
$Insert MSWin_CompareString_Equates
// Set the mode to Linguistic, sorting digits as numbers, case-insensitive,
// and with linguistic casing, using the "en-UK" locale
SCFlags = BitOr( LINGUISTIC_IGNORECASE$, NORM_LINGUISTIC_CASING$ )
SCFlags = BitOr( SCFlags, SORT_DIGITSASNUMBERS$ )
Call SetStrCmpMode( STRCMPMODE_LINGUISTIC$, SCFlags, "en-UK" )
// Now do some sorting ...Call V119( "S", "", "A", "L", data, "" )
GetStrCmpStatus function
While it is unlikely that the CompareStringEx function will raise any errors it is possible if incompatible flags or parameters are used. In this case Windows returns an error code which may be accessed in Basic+ via this function (See the CompareStringEx documentation for more details on error values).
Example:
$Insert RTI_StrCmpMode_Equates
$Insert MSWin_CompareString_Equates
// Set the mode to Linguistic, sorting digits as numbers, case-insensitive,
// and with linguistic casing, using the "en-UK" locale
SCFlags = BitOr( LINGUISTIC_IGNORECASE$, NORM_LINGUISTIC_CASING$ )
SCFlags = BitOr( SCFlags, SORT_DIGITSASNUMBERS$ )
Call SetStrCmpMode( STRCMPMODE_LINGUISTIC$, SCFlags, "en-UK" )
// Now do some sorting ...Call V119( "S", "", "A", "L", data, "" )
SCError = GetStrCmpStatus()
If SCError Then
ErrorText = RTI_ErrorText( "WIN", SCError )
End
Conclusion
This concludes this mini-series on OpenInsight string comparison processing. Hopefully you’ll find the new Linguistic Mode useful in your own applications, bearing in mind that some of the custom sorting options, such as “Treat Digits As Numbers”, can have a use in any application beyond simply dealing with non-English language sets.
Some of the more astute readers among you may have noticed that no mention of indexing has been made so far with respect to Linguistic Mode. This is because work is currently ongoing in this part of the system, and we’ll give you more details regarding this at a later date.
Further reading
More information on this subject may be found here:
Welcome to the second part of our mini-series explaining the mechanics of how string comparisons are handled in OpenInsight. In our previous post we looked at the inner workings when running in ANSI mode – this time we’ll look at UTF8 mode instead.
(Note that we’ve included some Basic+ pseudo-code examples in this post to illustrate more clearly how some parts of the comparison routines work. These are simplifications of the actual C++ internal functions and not actual code from the system itself.)
String comparison in UTF8 Mode
In UTF8 mode characters can be multi-byte and therefore have a value greater than 255 (normally referred to as their “code point”, or in Basic+ terms, the Seq() value of a character), so this means that the standard ANSI-mode method described previously cannot be used. Instead, a slightly different approach is taken to allow higher code points to be included in custom sorting.
When the system is loaded the UTF8 library creates an internal character-map (called the “ANSI-map”) which is a 256-element array (0-255) of code-point values. This is initialized to the same values as the standard ANSI character set, i.e. position 65 will have the code point for the ANSI character with the value of 65, position 230 will have the code point for the ANSI character with the value of 230 and so on.
This ANSI-map this can be changed at runtime so that code points that are higher than 255 can be included, and code points that appear in the ANSI-map are always sorted lower than those that aren’t, regardless of their actual value. The following functions (exported from RevUTF8.dll) are used to query and update the ANSI-map:
GetAnsiToUnicode – returns the code point for a specified map element.
// MapIndex - must be an integer between 0 and 255
CodePoint = GetAnsiToUnicode( MapIndex )
SetAnsiToUnicode – updates the code point for a specified map element.
// MapIndex - must be an integer between 0 and 255// NewCodePoint - integer value of the code point to set
Call SetAnsiToUnicode( MapIndex, NewCodePoint )
UTF8 comparison method
When comparing two characters we first need to find a “sort index” for a character which is determined as follows:
Get the code point value for the character being compared.
Look in the ANSI-map using the low byte value of the code point as the index. If the value at that position is the same as the character code point then the sort index is set to that index and it is marked as “found”.
E.g. If the character has a code point value of 458 (0x1CA) then it’s low-byte value is 202 (0xCA). If the ANSI-map contains the value 458 at index 202 then the sort index is set to 202 and it is marked as “found”.
Otherwise, scan backwards through the ANSI-map looking for an element that has the same value as the code-point for the character. If we match it then the sort index is set to the same position and it is marked as “found”.
// Pseudo-code
dim ansiMap( 255 )
sortIndex = -1 ; // Not found
codePoint = seq( ch )
testIndex = bitAnd( codePoint1, 0xFF )
if ( ansiMap( testIndex ) == codePoint ) then
// Found
sortIndex = testIndex
end else
// Not found
for testIndex = 255 to 0 step -1
if ( ansiMap( testIndex ) == codePoint ) then
// Found and exit loop
sortIndex = testIndex
end
next
end
Once this has been done for both characters we use the following comparison procedure:
If one of the characters is marked as “not found” and the other as “found”, the latter is always sorted before the former.
Otherwise we now proceed in a manner similar to the ANSI comparison:
If we are using a collation sequence the sorting value for each character is extracted from the appropriate sequence using the sort index we determined above.
E.g. if the sort index was 202 then the sort value for the comparison is the value of the byte at position 203 (1-based) in the sequence.
If we are using a case-insensitive comparison without a collation sequence the two sort indexes (not values!) are masked with 0xDF and compared.
If we are using a case-sensitive comparison without a collation sequence the two original code-point values are compared.
// Pseudo-code
begin case
case ( sortIndex1 == -1 ) and ( sortIndex2 == -1 )
// Both Non-ANSI-mapped - use a simple code point compare
cmpVal = codePoint1 - codePoint2
case ( sortIndex1 == -1 )
// sortIndex2 was found in the ANSI map so it's sorted lower
cmpVal = 1
case ( sortIndex2 == -1 )
// sortIndex1 was found in the ANSI map so it's sorted lower
cmpVal = -1
case OTHERWISE$
// Both are ANSI mapped
begin case
case hasCollationSequence
sortVal1 = seq( collationSequence[sortIndex1+1,1] )
sortVal2 = seq( collationSequence[sortIndex2+1,1] )
case isCaseInsensitive
sortVal1 = bitAnd( sortIndex1, 0xDF )
sortVal2 = bitAnd( sortIndex2, 0xDF )
case OTHERWISE$
sortVal1 = codePoint1
sortVal2 = codePoint2
end case
cmpVal = sortVal1 - sortVal2
end case
So, this system works pretty well out of the box for languages that can be expressed using the ANSI character set, but for other languages much of the burden falls on the application developer to maintain and tune the language settings and collation sequences to their requirements. This could require considerable effort and ignores much of the functionality provided by the OS itself, so in the next post we’ll take a look at how this is being addressed in the next release.
Developers with systems that require Unicode processing will be please to know that the next release adds some new and much-needed string comparison functionality, to which we have given the super-catchy title of the “Linguistic String Comparison Mode”. However, before we get into the details of that, it’s worth taking a look at how string comparison is currently handled in the system as it has a huge effect on how your data is sorted, and it seems to be one of those murky and little understood areas which includes arcane terms like “collation sequences” and “ANSI Maps”.
(Note that we’ve included some Basic+ pseudo-code examples in this post to illustrate more clearly how some parts of the comparison routines work. These are simplifications of the actual C++ internal functions and not actual code from the system itself.)
String comparison in ANSI mode
String comparison in ANSI mode (i.e. where every character has a value between 0 and 255) is usually a straightforward exercise of comparing the byte value of characters against each other. However, it is possible to customize this using the classic ARev technique of “collation sequences”, which allow a developer to assign a custom weighting, or “sort value”, to a specific character when it is compared against others.
Collation sequences are contained in records in the SYSENV table with a prefix of “CM_”. By default OpenInsight includes the following “CM_” records:
CM_ANSI
CM_ISO
CM_US
A collation sequence may be attached to a language definition (“LND”) record by specifying the “CM_” key in field <10>, and when you load the LND record the collation sequence is loaded too. For example, in the German LND record (LND_GERMAN_D) you will see CM_ISO has been specified like so:
SYSENV – LND_GERMAN_D
Each CM record may contain one or two fields, each field containing a block of 256 bytes that form a collation sequence:
To give a character a weighting simply find it it’s 1-based index in the sequence, and enter the byte value you want to give it instead. For example, if we wanted to give the “3” character a sort value of “87” then we do the following:
Find the ANSI byte value of the character “3”, which is 51 (0-based value).
In field<1> we replace the character at index 52 (1-based value) with a character that has the ANSI byte value of 87, which is “W”.
Points to note:
The first field (sequence) in a CM record is required, while the case-insensitive sequence is not – the system will use a default method for case-insensitive comparison as discussed below.
A collation sequence must be 256 characters in length. If not the system will not use it.
The last two characters in the sequence (255 and 256, 1-based) are always set to the byte values of 254 and 255 (@fm and @rm).
Some LND records have data in field<11> – this is for historical reasons and may be ignored.
So, now that we know what collation sequences are, we can see how the system uses them at runtime when it needs to compare strings values.
ANSI comparison when using a collation sequence
If a collation sequence has been specified (for either a case-sensitive operation, or a case-insensitive operation) the system uses it to extract the sorting value of the character being processed:
Get the ANSI byte value of the string character being compared.
Use the ANSI byte value as an index into the appropriate collation sequence.
The sorting value for the string character is the byte value at that index.
E.g. The character “3” has an ANSI byte value of 51 – using this as an index we get byte 52 (1-based) from the collation sequence – the value of that byte is the sorting value.
ANSI case-insensitive comparison without a collation sequence
Case-insensitive comparison of two characters without a collation sequence uses the classic ASCII-style technique of bit-masking both character ANSI byte values with a value of 0xDF and then comparing the results.
That concludes this look at ANSI string comparison – in the next post we’ll take a look at how string comparison is handled in UTF8 mode, which is, as you might expect, a little more complex.
The next release of OpenInsight sees some updates to Basic+ with the addition of two new statements and the return of an old Arev compiler keyword.
The LocateC statement
A counterpart to the well-known IndexC function, the LocateC statement simply performs a case-insensitive “locate” operation on a string, using the normal Locate statement syntax like so:
LocateC substring In string Using delimiter Setting pos Then/Else
The GSClear statement
This statement clears down the internal GoSub return-stack of the currently executing program, so that a subsequent Return statement will return to the calling program rather than jump back to an originating GoSub statement as normal. This is usually used to handle severe error conditions where an “early return” to the caller is desirable and there is a need to pass a value back to the caller. E.g:
Compile Function Test( Void )
GoSub DoTheThing
Return "The Thing Was OK"
DoTheThing:
GoSub CheckTheStuff
Return
CheckTheStuff:
If TheStuffIsReallyBad Then
// Return directly to the caller
GSClear
Return "The stuff is really bad!!!"
End
Return
This is basically the same as performing an Abort operation, but allows you to return a value, which Abort does not.
The Expendable statement
Marking a program as “expendable” was a useful feature in Advanced Revelation that instantly removed a program from the cached program array after it had finished executing. This was very useful in networked development scenarios where programs being edited could be loaded by different workstations – they could still get an updated version without having to restart the application or issue a manual GarbageCollect statement to clear the cache. This is a similar scenario to using a current tool like the OpenInsight EngineServer, where individual engines can cache programs during development – it can become tedious to force them to load an updated version after making changes.
To mitigate this the Expendable keyword has been reintroduced and is simply added to the program header declaration like so:
Compile Expendable Function( Param1 )
// Do stuff
Return RetVal
or, if you don’t use the optional “Compile” keyword:
Expendable Function( Param1 )
// Do stuff
Return RetVal
With this the engine now deletes the program from the cache after all references to it have been removed from the call stack, forcing it to reload from disk the next time it is called.