Hi Michael,
On 12/06/2024 15:55, Michael Weghorn wrote:
No need to apologize - thanks a lot for your valuable input! :-)
Ah - if you encourage me you get more ;-)
There, the Table interface also only exposes the same amount of cells as
are exposed via the a11y tree.
Fair enough;
Right; so - I mentioned "near to the screen" - by near, I mean we
will probably want a number of things that are navigationally close:
eg. "next heading" or somesuch - to lurk around as real & tracked
peers. The content of the Navigator headings should probably always be
present in a writer document's object hierarchy IMHO. That should let
ATs very quickly enumerate headings, jump focus to them with a simple
API etc.
That sounds interesting, but in a way also like a rather strange tree to
me if it contains elements of some type for the whole doc, but other
parts of the document in between are missing.
Indeed; and yet from a caching and performance perspective - it's gold
to give ATs exactly what they want pre-fetched and cached in-process,
and nothing more I guess; but of course fetching headings via a
different mechanism is probably sensible.
AT-SPI's flows-from and flows-to relations (and ARIA's aria-flowto) seem
somewhat similar to the UIA Navigation API you mention.
=) Ultimately we dynamically create peers as these methods are called
currently I imagine.
If they allow consistent access to off-screen content (related:
tdf#96492), they could potentially be used to retrieve the previous/next
heading,...
Sure; I guess the MS APIs have the problem that the interface
implemented tends also to be the protocol for remote COM querying of
peers whereas in Linux we can cut/de-couple that and can do better at
least in theory.
Although - I'd really suggest that a11y doesn't work against the
application, and if navigating - it should allow the AT to scroll the
actual visible/view-port to match what is being interrogated.
Interesting thought, and maybe that could be part of the solution, if it
becomes clearer what that can look like in practice.
Sure; so all/most applications have, in large scrolled panes, a mess of
logic to try to detect when a change moves focus, and when it moves the
scroll-area. How you manage both of those is fraught with fun and
unfortunate 'view jumping' ;-) In the collaborative case - consider: your
cursor moves - so you want to move the view-port to show the cursor, but
in fact it moved because someone re-sized a spreadsheet row above and
... ;-) anyhow; deep joy.
E.g. it would seem odd to me if an AT starts scrolling through the
document if a "go to next heading/list item" navigation command is
triggered, and then e.g. goes back if it doesn't find anything, because
it can't otherwise access the previously off-screen content to search
for the item.
I guess; but I really expect that there are keybindings and/or well
known actions for expert users that are used left and right, and that in
reality tracking the focused peer and interrogating it is some
overwhelming majority of the use cases to the point that first making
that piece really, really good, fast & complete is far more important
than anything else; but perhaps I'm mistaken.
I really think that's a mistake that will ultimately hurt ATs'
performance and that we should focus on the end-user use-cases we want
to succeed with - rather than having an abstract absolutist
pre-conception that we can expose everything in an efficient way =)
Sure - if there's a better way to properly make the AT use cases a
reality, then let's go that route instead. :-)
From a prioritization perspective, I'd really suggest working on the
majority platform for the impaired (Windows/NVDA) and on the
vast-majority use-case of getting really good & complete API and
feature coverage on the focused widget, before moving off into the more
tricky stuff :-)
But now I shut up ;-) we're working on the web side of this;
caching bits in the browser and adding another protocol latency there
- and I'm sure we want to be handling a reasonably bounded set of data
there =)
Is there an easy way to test COOL a11y web and impacts of potential
changes?
Ah - so; we tend to focus on the focused widget and things 'near' it -
adjacent table cells etc. when populating our shadow DOM. But at some
level the use-case we have for the a11y APIs is not really different
from what an AT would have, I think.
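
As a rough illustration of "the focused widget and things near it", here
is a minimal Python sketch. It is not COOL code: the function name
cells_near_focus and the radius parameter are hypothetical, chosen only to
show that the set of cells worth pre-fetching can be bounded by the
distance from the focus rather than by the size of the table.

```python
from typing import List, Tuple


def cells_near_focus(focus: Tuple[int, int], rows: int, cols: int,
                     radius: int = 1) -> List[Tuple[int, int]]:
    """Return (row, col) for cells within `radius` of the focused cell.

    Hypothetical helper: the result is clipped to the table bounds, so
    it never holds more than (2 * radius + 1) ** 2 entries, however
    large the table is.
    """
    r0, c0 = focus
    near = []
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            near.append((r, c))
    return near
```

With radius=1 that is at most nine cells to describe and cache, whether
the sheet has ten rows or a million.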
(I just opened a sample Writer doc on nextcloud.documentfoundation.org
and couldn't find the doc content via Accerciser in a quick test, but am
also not very familiar with web content/browser a11y.)
You will want:
<enable type="bool" desc="Controls whether accessibility
support should be enabled or not." default="false">false</enable>
enabled (i.e. set to true) in coolwsd.xml - and then to turn on
screen-reading support.
=)
As an additional note, one more potential source to get some interesting
insights could be to check how NVDA's browse mode is currently
implemented for MS Word, for example.
Indeed.
On 13/06/2024 13:27, Michael Weghorn wrote:
> I'm wondering whether one potential approach could e.g. be to provide
> different "modes" on how much Writer exposes in the a11y tree, and
> a way to switch between those....
Lots of things are possible of course.
> From looking a bit further into NVDA and Orca doc and some
> experimenting. It seems to me that access to the whole document
> is needed in particular in (1) structural navigation/browse mode...
Again; I'd respectfully suggest that creating APIs that make it
possible to easily do things that then scale badly ultimately does a
dis-service to the impaired; people quickly use them and write poorly
performing ATs.
A nice API for navigation and/or pre-fetching to enable linear reading
through documents, and/or reading of headings etc. seems to me far more
useful (and likely to perform well) - than an API that allows pre-fetch
of potentially hundreds of thousands of peers - even if we don't think
they will change in readonly mode =)
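
A minimal Python sketch of that contrast, under stated assumptions:
Paragraph, Document, next_heading, all_peers and the peers_created counter
are all hypothetical names, not an existing LibreOffice, UNO or AT-SPI
API. The point is only that a "next heading" call can scan the document
model and create a single peer for its result, where a "fetch everything"
call creates one peer per paragraph.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Paragraph:
    text: str
    is_heading: bool = False


@dataclass
class Document:
    paragraphs: List[Paragraph]
    peers_created: int = 0  # how many accessible peers were materialised

    def _make_peer(self, index: int) -> dict:
        # Stand-in for creating a real, tracked accessible object.
        self.peers_created += 1
        para = self.paragraphs[index]
        return {"index": index, "text": para.text,
                "is_heading": para.is_heading}

    def all_peers(self) -> List[dict]:
        # "Suck all state down" approach: one peer per paragraph.
        return [self._make_peer(i) for i in range(len(self.paragraphs))]

    def next_heading(self, after: int) -> Optional[dict]:
        # Navigation approach: scan the model, create a peer only for
        # the heading that is found (if any).
        for i in range(after + 1, len(self.paragraphs)):
            if self.paragraphs[i].is_heading:
                return self._make_peer(i)
        return None


def build_sample(n_body: int = 1000) -> Document:
    # One heading, n_body plain paragraphs, then a second heading.
    paras = [Paragraph("Intro", is_heading=True)]
    paras += [Paragraph(f"body {i}") for i in range(n_body)]
    paras.append(Paragraph("Conclusion", is_heading=True))
    return Document(paras)
```

On the sample document, next_heading(after=0) walks a thousand body
paragraphs in the model but creates exactly one peer, whereas all_peers()
creates one for every paragraph before the AT can even start looking.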
I think AT authors will always want all of the state of the app
exposed, and indeed they will want it all cached locally if they can get
it: but ... probably what is really useful is providing a way to write
simple, reliable, performant, context-aware, maintainable ATs easily -
and IMHO the "suck all state down as a first step" thing is not that :-)
Anyhow - glad you're wrestling it not me!
Regards,
Michael.
--
michael.meeks@xxxxxxxxxxxxx <><, CEO Collabora Productivity
(M) +44 7795 666 147 - timezone usually UK / Europe