Re: [a11y] LibreOffice Calc exposes 2^31 children, freezes on `GetChildren`

Michael Meeks <michael.meeks@xxxxxxxxxxxxx> · Thu, 13 Jun 2024 13:49:37 +0100

Hi Michael,

On 12/06/2024 15:55, Michael Weghorn wrote:
No need to apologize - thanks a lot for your valuable input! :-)

	Ah - if you encourage me you get more ;-)

There, the Table interface also only exposes the same amount of cells as 
are exposed via the a11y tree.

	Fair enough;

     Right; so - I mentioned "near to the screen" - by near, I mean we 
will probably want a number of things that are navigationally close, 
e.g. "next heading" or somesuch - to lurk around as real & tracked 
peers. The content of the Navigator headings should prolly always be 
present in a Writer document's object hierarchy IMHO. That should let 
ATs very quickly enumerate headings, jump focus to them with a simple 
API etc.

That sounds interesting, but in a way also like a rather strange tree to 
me if it contains elements of some type for the whole doc, but other 
parts of the document in between are missing.

	Indeed; and yet from a caching and performance perspective - it's gold 
to give ATs exactly what they want pre-fetched and cached in-process, 
and nothing more I guess; but of course fetching headings via a 
different mechanism is probably sensible.

AT-SPI's flows-from and flows-to relations (and ARIA's aria-flowto) seem 
somewhat similar to the UIA Navigation API you mention.

	=) Ultimately we dynamically create peers as these methods are called 
currently I imagine.

If they allow consistent access to off-screen content (related: 
tdf#96492), they could potentially be used to retrieve the previous/next 
heading,...

	Sure; I guess the MS APIs have the problem that the interface 
implemented tends also to be the protocol for remote COM querying of 
peers whereas in Linux we can cut/de-couple that and can do better at 
least in theory.

     Although - I'd really suggest that a11y doesn't work against the 
application, and if navigating - it should allow the AT to scroll the 
actual visible/view-port to match what is being interrogated.

Interesting thought, and maybe that could be part of the solution, if it 
becomes clearer what that can look like in practice.

	Sure; so all/most applications have in large scrolled panes a mess of 
logic to try to detect when a change moves focus, and when it moves the 
scroll-area. How you manage both of those is fraught with fun and 
unfortunate 'view jumping' ;-) in the collaborative case - consider your 
cursor moves - so you want to move the view-port to show the cursor, but 
in fact it moved because someone re-sized a spreadsheet row above and 
... ;-) anyhow; deep joy.

E.g. it would seem odd to me if an AT starts scrolling through the 
document if a "go to next heading/list item" navigation command is 
triggered, and then e.g. goes back if it doesn't find anything, because 
it can't otherwise access the previously off-screen content to search 
for the item.

	I guess; but I really expect that there are keybindings and/or well 
known actions for expert users that are used left and right, and that in 
reality tracking the focused peer and interrogating it is some 
overwhelming majority of the use cases to the point that first making 
that piece really, really good, fast & complete is far more important 
than anything else; but perhaps I'm mistaken.

     I really think that's a mistake that will ultimately hurt ATs' 
performance and that we should focus on the end-user use-cases we want 
to succeed with - rather than having an abstract absolutist 
pre-conception that we can expose everything in an efficient way =)

Sure - if there's a better way to properly make the AT use cases a 
reality, then let's go that route instead. :-)

	From a prioritization perspective; I'd really suggest working on the 
majority platform for the impaired: Windows/NVDA, and the vast-majority 
use-case: getting really good & complete API and feature coverage on 
the focused widget, before moving off into the more tricky stuff :-)

     But now I shut up ;-) we're working on the web side of this; 
caching bits in the browser and adding another protocol latency there 
- and I'm sure we want to be handling a reasonably bounded set of data 
there =)

Is there an easy way to test COOL a11y web and impacts of potential 
changes?

	Ah - so; we tend to focus on the focused widget and things 'near' it - 
adjacent table cells etc. - when populating our shadow DOM. But at some 
level the use-case we have for the a11y APIs is not really different 
from what an AT would use, I think.
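
To make "the focused widget and things 'near' it" concrete, here is a minimal Python sketch of picking a bounded window of table cells around the focus instead of walking every child. The function name, the `radius` parameter and the sheet dimensions are all assumptions made up for illustration; this is not how the shadow DOM is actually populated.

```python
from typing import Iterator, Tuple


def cells_near_focus(focus_row: int, focus_col: int, radius: int,
                     n_rows: int, n_cols: int) -> Iterator[Tuple[int, int]]:
    """Yield (row, col) for cells within `radius` of the focused cell,
    clamped to the grid bounds. All names here are hypothetical."""
    row_lo = max(0, focus_row - radius)
    row_hi = min(n_rows - 1, focus_row + radius)
    col_lo = max(0, focus_col - radius)
    col_hi = min(n_cols - 1, focus_col + radius)
    for row in range(row_lo, row_hi + 1):
        for col in range(col_lo, col_hi + 1):
            yield (row, col)


# Illustrative sheet size only (an assumption, not taken from this thread).
N_ROWS, N_COLS = 1_048_576, 16_384

window = list(cells_near_focus(focus_row=10, focus_col=10, radius=2,
                               n_rows=N_ROWS, n_cols=N_COLS))
corner = list(cells_near_focus(focus_row=0, focus_col=0, radius=1,
                               n_rows=N_ROWS, n_cols=N_COLS))
```

The work done is bounded by (2 * radius + 1) squared, whatever the size of the sheet, which is the property that matters when each cell costs a peer and a protocol round-trip.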

(I just opened a sample Writer doc on nextcloud.documentfoundation.org 
and couldn't find the doc content via Accerciser in a quick test, but am 
also not very familiar with web content/browser a11y.)

	You will want:

        <enable type="bool" desc="Controls whether accessibility support should be enabled or not." default="false">false</enable>

	Set to true in coolwsd.xml - and then turn on screen-reading support.

	=)

As an additional note, one more potential source to get some interesting 
insights could be to check how NVDA's browse mode is currently 
implemented for MS Word, for example.

	Indeed.

On 13/06/2024 13:27, Michael Weghorn wrote:
> I'm wondering whether one potential approach could e.g. be to provide
> different "modes" on how much Writer exposes in the a11y tree, and
> a way to switch between those....

	Lots of things are possible of course.

> From looking a bit further into NVDA and Orca doc and some
> experimenting, it seems to me that access to the whole document
> is needed in particular in (1) structural navigation/browse mode...

	Again; I'd respectfully suggest that creating APIs that make it 
possible to easily do things that then scale badly ultimately does a 
disservice to the impaired; people quickly use them and write poorly 
performing ATs.

	A nice API for navigation and/or pre-fetching to enable linear reading 
through documents, and/or reading of headings etc. seems to me far more 
useful (and likely to perform well) - than an API that allows pre-fetch 
of potentially hundreds of thousands of peers - even if we don't think 
they will change in readonly mode =)
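
The contrast between a navigation call and a bulk pre-fetch can be sketched in a few lines of Python. Everything below is a hypothetical stand-in: `Document`, `Node`, `next_with_role` and `fetch_everything` are invented for illustration and do not correspond to any real a11y API. The sketch assumes the application can check an element's role cheaply without creating a peer for it.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical accessible peer handed out to an AT."""
    index: int
    role: str
    text: str


class Document:
    """Hypothetical document model that counts how many peers it creates."""

    def __init__(self, items: Sequence[Tuple[str, str]]) -> None:
        self._items = list(items)
        self.peers_created = 0

    def __len__(self) -> int:
        return len(self._items)

    def role_at(self, index: int) -> str:
        # Cheap in-process lookup: no peer is created.
        return self._items[index][0]

    def peer(self, index: int) -> Node:
        # Expensive step: a real, tracked peer comes into existence.
        self.peers_created += 1
        role, text = self._items[index]
        return Node(index, role, text)


def next_with_role(doc: Document, start: int, role: str) -> Optional[Node]:
    """Navigation-style call: create a peer only for the element found."""
    for i in range(start + 1, len(doc)):
        if doc.role_at(i) == role:
            return doc.peer(i)
    return None


def fetch_everything(doc: Document) -> List[Node]:
    """Bulk pre-fetch: one peer per element in the document."""
    return [doc.peer(i) for i in range(len(doc))]


doc = Document([("heading", "Intro")] + [("paragraph", "...")] * 499
               + [("heading", "Results")] + [("paragraph", "...")] * 499)
hit = next_with_role(doc, start=0, role="heading")
```

In this toy model, jumping to the next heading creates a single peer, while `fetch_everything(doc)` creates one per element: 1000 here, and far more in a real document.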

	I think AT authors will always want all of the state of the app 
exposed, and indeed they will want it all cached locally if they can get 
it: but ... probably what is really useful is providing a way to write 
simple, reliable, performant, context-aware, maintainable ATs easily - 
and IMHO the "suck all state down as a first step" thing is not that :-)

	Anyhow - glad you're wrestling it not me!

	Regards,

		Michael.

--
michael.meeks@xxxxxxxxxxxxx <><, CEO Collabora Productivity
(M) +44 7795 666 147 - timezone usually UK / Europe