Hi Michael,

On 2024-06-13 14:49, Michael Meeks wrote:
>> AT-SPI's flows-from and flows-to relations (and ARIA's aria-flowto) seem somewhat similar to the UIA Navigation API you mention.

> =) Ultimately we dynamically create peers as these methods are called currently, I imagine.
Yes, but that currently has issues, see the previously mentioned tdf#96492.
>> If they allow consistent access to off-screen content (related: tdf#96492), they could potentially be used to retrieve the previous/next heading,...

> Sure; I guess the MS APIs have the problem that the interface implemented tends also to be the protocol for remote COM querying of peers, whereas in Linux we can cut/de-couple that and can do better, at least in theory.
IIUC, we'd still have an AT-SPI bus roundtrip for each flows-from/flows-to call, so navigating through a larger document exclusively via flows-from/flows-to comes with a cost. On the other hand, IIUC, when using AT-SPI Collection, the bus overhead would be lower, as it's just one call to the interface method, then the filtering/... happens on the application side, and the result is returned in a single D-Bus reply again. (I haven't worked with it in practice yet, though.)
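To make that a bit more concrete, here is a rough, untested sketch of what a single Collection query for all headings of a document could look like, using the Atspi GI bindings (parameters written from memory, so please treat the exact names/arguments as approximate rather than authoritative):

    import gi
    gi.require_version("Atspi", "2.0")
    from gi.repository import Atspi

    def get_all_headings(document):
        # Match rule: any object whose role is "heading"; no state, attribute
        # or interface filtering.
        rule = Atspi.MatchRule.new(
            Atspi.StateSet(), Atspi.CollectionMatchType.ALL,
            {}, Atspi.CollectionMatchType.ALL,
            [Atspi.Role.HEADING], Atspi.CollectionMatchType.ANY,
            [], Atspi.CollectionMatchType.ALL,
            False)
        # One call on the document's Collection interface; the filtering happens
        # in the application and all matches come back in a single reply.
        return Atspi.Collection.get_matches(
            document, rule, Atspi.CollectionSortOrder.CANONICAL, 0, True)

(IIRC, Orca's collection helpers do something along those lines as well.)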
>> E.g. it would seem odd to me if an AT starts scrolling through the document if a "go to next heading/list item" navigation command is triggered, and then e.g. goes back if it doesn't find anything, because it can't otherwise access the previously off-screen content to search for the item.

> I guess; but I really expect that there are keybindings and/or well-known actions for expert users that are used left and right, and that in reality tracking the focused peer and interrogating it is some overwhelming majority of the use cases, to the point that first making that piece really, really good, fast & complete is far more important than anything else; but perhaps I'm mistaken.
I agree that proper focus handling is essential, and making that work well is definitely a priority.
Of course there are issues and a plan to look into those, but my impression so far is that reporting focus in Writer works decently *in general*. If you're aware of particularly pressing issues here, please don't hesitate to mention them (ideally directly in Bugzilla).
I don't think focus tracking by itself is sufficient for a really good a11y experience, though. When it comes to navigating through documents, some kind of "browse mode" that allows navigating between objects in the document provides an experience that users are familiar with from e.g. web browsers or other office suites, and it is something that users have been asking for repeatedly. (LO's own Navigator provides some similar functionality, but it is less well integrated with the screen reader.)
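For comparison with the Collection sketch above, a "go to next heading" browse-mode command built only on flows-to would look roughly like the following (again just an untested sketch against the Atspi GI bindings, not how Orca or any other AT actually implements this). It needs one bus roundtrip per hop and only works if the off-screen objects are exposed at all:

    import gi
    gi.require_version("Atspi", "2.0")
    from gi.repository import Atspi

    def next_heading(obj, max_hops=100000):
        # Follow FLOWS_TO from the current object until a heading is reached.
        for _ in range(max_hops):
            target = None
            for relation in obj.get_relation_set():
                if (relation.get_relation_type() == Atspi.RelationType.FLOWS_TO
                        and relation.get_n_targets() > 0):
                    target = relation.get_target(0)
                    break
            if target is None:
                # Dead end, e.g. because the following content isn't exposed
                # (cf. tdf#96492).
                return None
            if target.get_role() == Atspi.Role.HEADING:
                return target
            obj = target
        return None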
> From a prioritization perspective; I'd really suggest working on the majority platforms for the impaired: Windows/NVDA,
Personal preferences aside, Windows/NVDA as the most widely used platform indeed generally has some priority for me, as does Writer over Calc over everything else.
There are other factors I also take into account, though, e.g. involvement/contributions from others working in certain areas, user requests/tickets, possibilities to cooperate (e.g. the Orca maintainer reworking Orca's LibreOffice support and providing a lot of helpful feedback and input), and productivity: my productivity on Linux is way higher than on Windows, so my take is that putting in some extra initial effort in order to be able to do most of the analysis on Linux for issues *also* affecting Windows usually pays off, in particular since the platform APIs (IAccessible2/AT-SPI2) are fairly similar.
> and vast-majority use-cases: of getting really good & complete API and feature coverage on the focused widget, before moving off into the more tricky stuff :-)
As mentioned above: if you're aware of major issues regarding API support or use cases for the focused widget, please don't hesitate to mention them so they can be taken into account.
> But now I shut up ;-) we're working on the web side of this; caching bits in the browser and adding another protocol latency there - and I'm sure we want to be handling a reasonably bounded set of data there =)

>> Is there an easy way to test COOL a11y web and impacts of potential changes?

> Ah - so; we tend to focus on the focused widget and things 'near' it - adjacent table cells etc. when populating our shadow DOM. But at some level the use-case we have for the a11y APIs is not really different than an AT would use I think.

>> (I just opened a sample Writer doc on nextcloud.documentfoundation.org and couldn't find the doc content via Accerciser in a quick test, but am also not very familiar with web content/browser a11y.)

> You will want:
>
>     <enable type="bool" desc="Controls whether accessibility support should be enabled or not." default="false">false</enable>
>
> enabled in coolwsd.xml - and then to turn on screen-reading support. =)
Thanks, will try that!
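If I understand that correctly, that means flipping the quoted setting to "true", i.e. presumably something like the following in coolwsd.xml (the enclosing <accessibility> element is just my guess from the snippet's context; corrections welcome), and then turning on screen-reading support in the UI as you mention:

    <accessibility>
        <enable type="bool" desc="Controls whether accessibility support should be enabled or not." default="false">true</enable>
    </accessibility>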
> Again; I'd respectfully suggest that creating APIs that make it possible to easily do things that then scale badly ultimately does a dis-service to the impaired; people quickly use them and write poorly performing ATs.

> A nice API for navigation and/or pre-fetching to enable linear reading through documents, and/or reading of headings etc. seems to me far more useful (and likely to perform well) - than an API that allows pre-fetch of potentially hundreds of thousands of peers - even if we don't think they will change in readonly mode =)

> I think AT authors will always want all of the state of the app exposed, and indeed they will want it all cached locally if they can get it: but ... probably what is really useful is providing a way to write simple, reliable, performant, context-aware, maintainable ATs easily - and IMHO the "suck all state down as a first step" thing is not that :-)
Having at least some experience working together with AT developers (particularly Orca and NVDA), I'm a bit more optimistic than the above sounds about being able to find good solutions together here, in particular when thinking about an opt-in feature that doesn't just get used "automatically" whenever any AT is active.
But of course, I fully agree that identifying suitable APIs/interaction should not be neglected as part of the process.
As a side note, my expectation is that, apart from identifying the right way to make the whole document available to ATs, many aspects (or current LO shortcomings) that will show up in an effort to implement browse mode will be relevant independently of the exact API used to expose off-screen content, and addressing those issues will also matter for interacting with on-screen content (e.g. incomplete/incorrect support for specific a11y interfaces, as one example you mentioned earlier).
Regards, Michael