Author Topic: Sibiac: Single Image Blob Interface Accessible Control  (Read 431 times)

Offline azslow3

  • Administrator
  • Hero Member
  • *****
  • Posts: 1034
Sibiac: Single Image Blob Interface Accessible Control
« on: July 03, 2017, 05:10:08 PM »
Early prototype, pre-alpha quality.

How it works
Many applications, music applications in particular, are not accessible at all, for reasons I have analyzed in another post, "Accessibility in applications, my point of view". They cannot be controlled with the keyboard and they do not expose any text to screen readers.

There are several concepts already used in general and application-specific screen reader add-ons. It is possible to bind a keyboard shortcut to a mouse click at a specific position. It is possible, to some degree, to find the required position by analyzing colors on the screen. It is possible to grab a part of the screen and use OCR to extract text from it.

What I have not found is any attempt to combine all these methods to extract sufficient information from the image and reconstruct a complete interface element.
For example, a button on the screen is some area with text or an image, sometimes with a color indicating its current state. Taking those colors, grabbing the corresponding image area and feeding it to OCR, it should be possible to build the same information that an accessible button provides directly. Clicking at those screen coordinates is the equivalent of pressing the button. By scanning colors in a list box, it should be possible to find the position of the currently active element, then grab it and feed it to OCR, or click on it or on the next/previous element, imitating an accessible list. And so on.
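
A minimal sketch of the list box idea, assuming the grabbed screen area is already available as a plain 2D list of (r, g, b) tuples and that all rows have the same height. The function names, the highlight color and the tolerance are hypothetical and chosen for illustration only; this is not Sibiac code, and the OCR and clicking steps are left out.

```python
# Hypothetical sketch, not Sibiac code: locate the highlighted row of a
# list box by color. The screenshot is a 2D list of (r, g, b) tuples.

def color_close(a, b, tolerance=10):
    """True if two RGB colors differ by at most `tolerance` in every channel."""
    return all(abs(x - y) <= tolerance for x, y in zip(a, b))


def find_active_row(image, row_height, highlight, sample_x=0, tolerance=10):
    """Return the index of the first row whose middle pixel (at column
    sample_x) matches the highlight color, or None if no row matches."""
    for row_index in range(len(image) // row_height):
        y = row_index * row_height + row_height // 2  # middle line of the row
        if color_close(image[y][sample_x], highlight, tolerance):
            return row_index
    return None


def row_rect(row_index, row_height, width):
    """(left, top, right, bottom) of a row: the area to grab and feed to OCR."""
    top = row_index * row_height
    return (0, top, width, top + row_height)


def row_center(row_index, row_height, width):
    """Point inside a row where a mouse click would select it."""
    return (width // 2, row_index * row_height + row_height // 2)


# Usage: a 4x6 pixel "list box" with three rows, the second one highlighted.
WHITE, BLUE = (255, 255, 255), (0, 120, 215)
image = [[WHITE] * 4 for _ in range(6)]
image[2] = [BLUE] * 4
image[3] = [BLUE] * 4
active = find_active_row(image, row_height=2, highlight=BLUE)
```

With the active row known, `row_rect(active, ...)` gives the area to pass to OCR, and `row_center(active + 1, ...)` or `row_center(active - 1, ...)` gives the click position for the next or previous element.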

With implementations for the required user interface primitives, it should be possible to build accessibility information just from the image of the interface, and so make any program accessible, in the ideal case even from real-time webcam video of the monitor. I mean completely independent of whatever accessibility level the application and OS do or do not provide.

While that sounds like a plan, the approach has two difficult aspects:
1. How to find user interface elements in the image? In other words, how to understand that there is a button here and a list there? While humans are smart and interface developers normally do not try to puzzle users, the answers to such questions are sometimes not trivial even for an experienced sighted person. That is a very interesting topic for image processing and pattern recognition students, yet I have not found published work in that direction (maybe I have to search more). I have decided to bypass that for the moment. Fortunately, for the music applications I currently target, the interfaces are fixed-size dialogs. So, looking at the interface, I extract that information and manually write it into the configuration for the particular application.
2. How to work with a particular control? Some controls, like the button mentioned above, are rather straightforward. Others are way more tricky. Almost any new type of control is a challenge, but I am slowly moving on.
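
To illustrate the first point, here is a sketch of what a manually written description of a fixed-size dialog could look like. The `Control` class, its fields and the example coordinates are all hypothetical; the actual Sibiac configuration format is not shown in this post.

```python
# Hypothetical sketch: a hand-written description of a fixed-size dialog.
# Names, fields and coordinates are illustrative, not Sibiac's real format.
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    name: str    # text announced to the user
    kind: str    # "button", "list", ...
    left: int    # position relative to the dialog's top-left corner
    top: int
    width: int
    height: int

    def center(self):
        """Point to click in order to activate the control."""
        return (self.left + self.width // 2, self.top + self.height // 2)

    def contains(self, x, y):
        """True if the point (x, y) lies inside the control."""
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


# One dialog, written down by looking at the interface.
DIALOG = [
    Control("Preset list", "list", 10, 40, 200, 120),
    Control("Load", "button", 220, 40, 60, 20),
]


def control_at(controls, x, y):
    """Return the first control containing the point, or None."""
    for control in controls:
        if control.contains(x, y):
            return control
    return None
```

Because the dialog never changes size, the coordinates stay valid and keyboard navigation can simply step through the list of controls.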

Sibiac
Sibiac is an add-on for NVDA. I am not an experienced NVDA programmer, the concept does not really fit into the NVDA framework, and the whole thing is at an early development stage. So my code is more fighting with NVDA than cooperating with it.

The current version has absolutely no value for anyone without Cakewalk Sonar X3, since the included interfaces are: Native Instruments Guitar Rig 5 and Absynth 5, Cakewalk Session Instruments Strings, Piano, Bass and Drums, Cakewalk Sound Center, Cakewalk Dimension Pro and Cakewalk Session Drummer 3. All except the first two work only when used from within Sonar.

But work is in progress, and all suggestions, comments or help (testing, better NVDA integration, etc.) are welcome.

Current version download: http://azslow.com/files/sibiac.0.8.nvda-addon

Licenses
The package includes a binary version of Tesseract OCR, compiled using GCC. The corresponding license files can be found in the tesseract subdirectory.

The binary proxy library libsibiac.dll, except for the CRC32 code, is currently under the Apache License 2.0; the source code can be provided on request.

sibiact.exe is provided "as is" WITHOUT ANY WARRANTY and, at the moment, without the source code. It is not essential for this add-on's operation but can be helpful when defining new applications.

All other files are covered by the GNU General Public License (Version 2).
« Last Edit: August 10, 2017, 07:51:16 PM by azslow3 »

Offline crozell25

  • Newbie
  • *
  • Posts: 3
Re: Sibiac: Single Image Blob Interface Accessible Control
« Reply #1 on: July 04, 2017, 03:40:03 PM »
A screen reader I have used is Autoit. It is able to detect colors and images and is programmable. I do not know all of its programming capabilities, but it is excellent for automation.

Offline azslow3

Re: Sibiac: Single Image Blob Interface Accessible Control
« Reply #2 on: July 04, 2017, 09:01:50 PM »
Thank you for the comment. If I understand correctly, Autoit is not a screen reader. But some of the techniques used in it can give me fresh ideas.

Offline crozell25

Re: Sibiac: Single Image Blob Interface Accessible Control
« Reply #3 on: July 05, 2017, 09:18:08 AM »
While on the surface it may not seem like a screen reader, it is capable of looking at a window and searching for images. When an image is found, a command can be executed.
Since I commented, I realized that with music software latency is an issue. Autoit is basically a loop-based script: the lower the latency you set, by using smaller delays between script commands, the more CPU resources it uses, and the increase can become rather significant.
Originally it was developed to install software automatically. When an image was found on the screen, the script knew which command to proceed to next.
Hopefully what I said makes sense.
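
The latency versus CPU trade-off of a polling loop can be put into rough numbers. This is only a back-of-the-envelope sketch in Python, not Autoit code; the function name and the assumption that every pass costs a fixed scan time followed by a fixed delay are illustrative.

```python
# Hypothetical model of a polling loop: each pass scans the screen for
# scan_ms milliseconds of CPU time, then sleeps for delay_ms milliseconds.

def polling_cost(delay_ms, scan_ms):
    """Estimate the rate, worst-case reaction time and CPU share of the loop."""
    period_ms = delay_ms + scan_ms
    return {
        # how many scans run per second
        "polls_per_second": 1000.0 / period_ms,
        # a change right after a capture waits one full period, then
        # needs one more scan before it is detected
        "worst_case_latency_ms": period_ms + scan_ms,
        # share of time spent scanning instead of sleeping
        "busy_fraction": scan_ms / period_ms,
    }


relaxed = polling_cost(delay_ms=90, scan_ms=10)  # long delay between scans
tight = polling_cost(delay_ms=0, scan_ms=10)     # no delay at all
```

Under these assumptions the relaxed loop keeps one core about 10% busy and reacts within roughly 110 ms, while the tight loop reacts within about 20 ms but keeps the core fully busy.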


Offline azslow3

Re: Sibiac: Single Image Blob Interface Accessible Control
« Reply #4 on: July 05, 2017, 09:52:51 AM »
In fact, I could not understand the problem until I installed NVDA. It is free and supports portable installation (without bloating the system). For anyone with a monitor the situation is hard to appreciate (see my general post). Latency and CPU are not an issue; it is not about speed, it is about theoretical possibility.

All screen readers have advanced scripting with a huge set of information extraction features. For some reason they have decided to "stop" at non-image information, except for a manual OCR call on a whole element (e.g. in browsers). I do not exclude a relatively trivial explanation: the development is driven by people who have never seen the screen, and for them it is hard to imagine what a picture interface really is. For someone who could see before, the idea is almost trivial. That is how this project was born.

And that is why originally sight-based tools like Autoit can contribute. Even where they have useful features, they were not considered by "mainstream" accessibility developers.
But that works only to some degree. The primary purpose of a screen reader is to describe the current interface state to the user (strictly as text) and to let the user decide what to do next. I mean, it is not simple "when - then" behavior.


Offline crozell25

Re: Sibiac: Single Image Blob Interface Accessible Control
« Reply #5 on: July 05, 2017, 09:57:29 AM »
I read your post again and I understand what you mean by a screen reader. So no, Autoit doesn't read the screen. I was thinking of a screen reader as a program that looks at the screen and, via a screen grab, finds a control that cannot be assigned to a MIDI controller.
This is the direction I was going toward, since I have software that cannot be controlled via MIDI or OSC. The knobs are on the screen and can be manipulated with the mouse but, as far as I can tell without exhausting my time, they cannot be controlled otherwise.
Another point you touched on was colors. I have a color deficiency myself, which renders me unable to distinguish most shades of the same color.
I commend your work in the community and thank you for your plugin.

Offline azslow3

Re: Sibiac: Single Image Blob Interface Accessible Control
« Reply #6 on: July 05, 2017, 01:41:12 PM »
Yes, the primary target for this board is "Accessibility".

But the methods can be carried from one world to the other. I am thinking about adding "clicking" to my AZ Controller. With many synths I have mapped MIDI controls to the normal keyboard arrows and Enter, but I still have to click on the required region first to choose the preset. Unlike you, I do not use VSTs with mouse-only parameters, but that can be the "next level" in the same direction.

Coming back from the accessibility world, there are already useful features:
* In AZ Controller that is audition. E.g. I (and some others) have found it useful to hear markers when they are passed during playback; it is much less distracting than looking at the screen while playing and wondering "hmm... which verse is that going to be?".
* MarKo has written a utility to convert an AZ Controller preset into a text file by parsing the GUI where the preset is loaded. I must admit, I do not have the skills to do this in reasonable time.

That is why I think exchanging technologies between the accessibility and visual worlds can have benefits for both sides.