Re: [a11y] LibreOffice Calc exposes 2^31 children, freezes on `GetChildren`

Hi,

thanks for sharing your thoughts!

On 2024-06-10 16:37, Michael Meeks wrote:
    Let me add my 2 cents; a spreadsheet can have 2^20 rows, and 2^14 columns - that's 34 bits already - so even in the case that you thought you wanted to iterate them all over a remote bus - you don't.

True.
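
For reference, the arithmetic behind that can be checked quickly. This is only a sketch: the 2^20 x 2^14 grid is taken from the quote above, and the signed 32-bit limit is an assumption based on the 2^31 mentioned in the subject line.

```python
ROWS = 2**20      # 1,048,576 rows
COLUMNS = 2**14   # 16,384 columns
cells = ROWS * COLUMNS

# Number of bits needed to give every cell its own child index (0 .. cells - 1).
bits_needed = (cells - 1).bit_length()

# Assumption: the child count is reported as a signed 32-bit integer.
INT32_MAX = 2**31 - 1

print(cells, bits_needed, cells > INT32_MAX)
```

So a full sheet has eight times more cells than a signed 32-bit child count can represent.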

    Indeed - the whole idea is madness; there was this manages-descendants hack in the past to try to tag such widgets to avoid iterating them. My attempts to encourage people to expose only the visible (or near visible) items in the past were not that productive; but I still firmly believe this is the only sensible way to do this.

Large tables are actually the example that the AT-SPI MANAGES_DESCENDANTS doc currently uses [1]:

"Used to prevent need to enumerate all children in very large containers, like tables. The presence of Atspi.StateType.MANAGES_DESCENDANTS is an indication to the client that the children should not, and need not, be enumerated by the client."
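
To illustrate what that asks of a client, here is a minimal sketch of a tree walk that stops at such containers. It doesn't use the real AT-SPI bindings: `FakeAccessible`, `collect_names` and the string constant are hypothetical stand-ins, just to show the control flow.

```python
from dataclasses import dataclass, field

# Stand-in for Atspi.StateType.MANAGES_DESCENDANTS; a real client would
# query the state set of the accessible object instead.
MANAGES_DESCENDANTS = "manages-descendants"


@dataclass
class FakeAccessible:
    """Hypothetical accessible object, used only for this illustration."""
    name: str
    states: frozenset = frozenset()
    children: list = field(default_factory=list)


def collect_names(node, names=None):
    """Depth-first walk that does not descend into managed containers."""
    if names is None:
        names = []
    names.append(node.name)
    if MANAGES_DESCENDANTS in node.states:
        # The container manages its own descendants: skip enumeration.
        return names
    for child in node.children:
        collect_names(child, names)
    return names


sheet = FakeAccessible(
    "Sheet1",
    states=frozenset({MANAGES_DESCENDANTS}),
    children=[FakeAccessible("A1"), FakeAccessible("B1")],
)
toolbar = FakeAccessible("Toolbar", children=[FakeAccessible("Save")])
frame = FakeAccessible("Frame", children=[toolbar, sheet])

print(collect_names(frame))
```

The walk visits the sheet itself but none of its cells; a client would then have to rely on events or on other interfaces (e.g. Table) to get at individual cells.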



    Best of luck with this; I would really recommend that we focus on exposing only the data that is either visible - or better close to visible (ie. within a page-up/page-down / etc. around the document), with perhaps an extension of peers for eg. interesting headings in the document so these can be cached and enumerated (ie. what you see in the navigator).

Limiting children to the "close to visible" cells sounds like a potential approach.

However, that would IMHO still need a clear specification on how to implement it and how all relevant AT use cases are covered.

Some aspects/questions that might need some further consideration:

* How do other interfaces (like AT-SPI Table, TableCell and Selection) expose information? Does e.g. the table report it only has 50 rows and 30 columns if that's what's visible on screen? Does cell Q227 report a row and column index of 0 if it's the first one in the visible area?

* In some cases, off-screen children are of interest, e.g. if they are contained in the current selection. How should that be handled? (E.g. how does the screen reader announce something like "cell A1 to C100 selected" if cell A1 "doesn't exist" because it's off-screen?)

* Exposing and caching all cells based on visibility means that whenever the viewport changes, this needs to be actively updated (push approach), which comes with a cost (that I can't estimate right now). (We currently have that for other modules, see e.g. comment [3] for Impress.)

* How do screen readers implement features like "read the whole row"? Do they just read the part of the row that's currently visible on screen and leave out the rest? Or do they somehow implement some extra logic to retrieve the remaining content?

* Is navigating to an "arbitrary" cell still possible via a11y API, e.g. if some screen reader specific table navigation command implements "jump to the first/last cell in the table" or "select the current row"?
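
To make the first and last questions more concrete, here is one possible answer as a minimal sketch. It is not how LO or any existing a11y API behaves; `WindowedTable`, `Cell` and `column_label` are hypothetical names. The assumption is that the table keeps reporting its full size and absolute cell indices, exposes only a window of cells as children, and still allows looking up any cell on demand.

```python
from dataclasses import dataclass


def column_label(index):
    """Convert a 0-based column index to a spreadsheet label (0 -> A, 26 -> AA)."""
    label = ""
    index += 1
    while index:
        index, rem = divmod(index - 1, 26)
        label = chr(ord("A") + rem) + label
    return label


@dataclass(frozen=True)
class Cell:
    row: int      # absolute 0-based row in the sheet
    column: int   # absolute 0-based column in the sheet

    @property
    def name(self):
        return f"{column_label(self.column)}{self.row + 1}"


@dataclass
class WindowedTable:
    n_rows: int      # full sheet size, reported regardless of the window
    n_columns: int
    rows: range      # rows currently exposed as children
    columns: range   # columns currently exposed as children

    def child_count(self):
        return len(self.rows) * len(self.columns)

    def child_at(self, index):
        """Child index is relative to the window, the cell's indices are not."""
        if not 0 <= index < self.child_count():
            raise IndexError(index)
        r, c = divmod(index, len(self.columns))
        return Cell(self.rows[r], self.columns[c])

    def cell_at(self, row, column):
        """On-demand lookup of any cell, including off-screen ones."""
        if not (0 <= row < self.n_rows and 0 <= column < self.n_columns):
            raise IndexError((row, column))
        return Cell(row, column)


# 50 rows and 30 columns exposed, with Q227 as the first exposed cell.
table = WindowedTable(2**20, 2**14, range(226, 276), range(16, 46))
first = table.child_at(0)
print(table.child_count(), first.name, first.row, first.column)
print(table.cell_at(0, 0).name)
```

In this variant, Q227 keeps row index 226 and column index 16 although it is child 0, and A1 can still be retrieved for announcing a selection. Whether ATs could actually work with that split is exactly the kind of thing that would need to be specified.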


As mentioned earlier, the discussion in GTK issue [2] provides some valuable insights and ideas, but doesn't answer all questions yet, and there are likely more when looking further into the details.

    though of course it is then ideal to have some nice navigation API support wrapped around that

What kind of API does that refer to? Existing or new API on the platform a11y level that LO (or the toolkits it uses) would then implement, or something else? Do you have anything particular in mind?


    Oddly, Writer - which could prolly cope rather better with exposing all paragraphs set out by cropping to the visible content, whereas Calc where this was always a silly idea tried to expose everything ;-)

That's indeed unfortunate...

I've been told repeatedly that the fact that Writer doesn't expose off-screen document content is indeed a problem as it breaks features like browse mode/document navigation in NVDA or Orca (see e.g. tdf#35652, tdf#137955, tdf#91739, tdf#96492).

Exposing off-screen Writer document content is actually something I plan to look into at some point. My idea so far is to also expose pages on the a11y level, which should avoid the problem of a single object (the document) having an enormous amount of children due to that.
If there are any general concerns about that, please raise them. :-)


The feedback I've received from a11y experts so far is that off-screen doc content should *generally* be exposed on the a11y level, and limiting Calc to not do that with its huge amount of table cells is meant to be an exception to the rule in that regard (see e.g. the discussion in [2] and tdf#156657).

I think it's fair to treat that specially, but (repeating myself here) my take is it needs clarity on what's the "correct" way to do that, and that's something that would IMHO ideally be clearly specified by AT and/or a11y protocol developers in a general guideline that app developers can cling to, rather than LO inventing something by itself.

If anyone has further thoughts on that, please don't hesitate to share them! :-)


[1] https://lazka.github.io/pgi-docs/Atspi-2.0/enums.html#Atspi.StateType.MANAGES_DESCENDANTS
[2] https://gitlab.gnome.org/GNOME/gtk/-/issues/6204
[3] https://gerrit.libreoffice.org/c/core/+/137622/comments/c5f34b0f_c47a1b82
