Hi everyone,
Thanks for such a detailed discussion. I have corrected certain parts of the PR https://gerrit.libreoffice.org/c/core/+/177364
and the 'make' build has been running since 4:46 PM.
You should specify the new chart type as it would be specified in the
standard. That text can go to our Wiki, linked from
https://wiki.documentfoundation.org/Development/ODF_Implementer_Notes/List_of_LibreOffice_ODF_Extensions.
Writing it down helps you to become clear about functionality and helps
in writing the UNO information in the idl-file. Currently the info in
the idl file is not detailed enough. You can look at section "19.15
chart:class" in ODF 1.3.
[https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part3-schema/OpenDocument-v1.3-os-part3-schema.html]
and in the corresponding information for Excel. Search for histogram on
site:microsoft.com and look at its specification in [MS-ODRAWXML]. You
need to extend the above mentioned List_of_LibreOffice_ODF_Extensions in
any case.
Added to the Implementer Notes, but I still have to write a more detailed blog post
(should I post that on the TDF/LO Blog?).
You must extend the schema. Those changes go to
https://opengrok.libreoffice.org/xref/core/schema/libreoffice/OpenDocument-v1.4%2Blibreoffice-schema.rng.
That is missing in your patch.
Done (in the PR).
The histogram chart does not belong to the charts, that are specified in
the standard. Thus it needs a value for the chart:class attribute, that
has a loext prefix, e.g. chart:class="loext:histogram". A schema change
is not needed for this value, because the data type for the value of
this attribute is already 'namespacedToken'.
You have added the 'bin' related information to the <chart:series>
element. A <chart:plot-area> element can have several <chart:series>
sub-elements. I guess, that you do not want to allow several series in
the same histogram. Excel does not allow it. Restricting it in the schema
is difficult. (Or do you have an idea, Michael?) I suggest to restrict
it in the specification text.
You export the labels for the x-axis as loext:BinRange. I would not
export them at all for these reasons:
(A) Excel does not export that information.
(B) The chart has a reference to the area of the data source in the
table. The content of this area might come from an external source, e.g.
a database engine. When the file is loaded, this data might be refreshed
and change. Thus the bin labels and their frequency values might not
fit the information that was put into the file when saving.
You write the 'bin' related information as attributes of the
<chart:series> element. You should consider to use one child element
instead, that contains all needed information. That way you can use a
dedicated context when loading the file. The schema would get one new
child-element for the <chart:series> element and a new section for this
new element itself. Michael, what do you think?
I still have to discuss this with Tomaz.
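To make the child-element idea above concrete, here is a minimal Python sketch (not LibreOffice code). The element name loext:histogram-binning and its attributes loext:bin-width and loext:bin-count are hypothetical names chosen only for illustration; nothing has been decided about them. The two namespace URIs are the ones I believe ODF and LibreOffice use for the chart and loext prefixes.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as I understand them; please double-check against the schema.
CHART_NS = "urn:oasis:names:tc:opendocument:xmlns:chart:1.0"
LOEXT_NS = "urn:org:documentfoundation:names:experimental:office:xmlns:loext:1.0"
ET.register_namespace("chart", CHART_NS)
ET.register_namespace("loext", LOEXT_NS)

# Hypothetical element and attribute names, for illustration only.
BINNING = f"{{{LOEXT_NS}}}histogram-binning"
BIN_WIDTH = f"{{{LOEXT_NS}}}bin-width"
BIN_COUNT = f"{{{LOEXT_NS}}}bin-count"


def make_series(bin_width, bin_count):
    """Build a <chart:series> with one child element holding all bin settings."""
    series = ET.Element(f"{{{CHART_NS}}}series")
    binning = ET.SubElement(series, BINNING)
    binning.set(BIN_WIDTH, repr(bin_width))
    binning.set(BIN_COUNT, str(bin_count))
    return series


def read_binning(series):
    """Read the bin settings back; return None if the child element is absent."""
    binning = series.find(BINNING)
    if binning is None:
        return None
    return {
        "bin-width": float(binning.get(BIN_WIDTH)),
        "bin-count": int(binning.get(BIN_COUNT)),
    }
```

With a single child element, the import side only has to handle one element in one place, which is roughly what a dedicated context would do.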
Different variations (types) are possible for the histogram chart. You
need to specify in the text how the bins are calculated. Especially how
'automatic' works and how overflow and underflow bins influence the bin
intervals.
We are using Scott's rule to calculate the histogram bins automatically, which my MSO also uses. The code is in
chart2/source/model/template/HistogramCalculator.cxx
Here are the changes for the underflow and overflow calculations (I reverted these changes during the cleanup of the PR):
- Overflow Bin: Added at the end of maBinRanges and maBinFrequencies for values exceeding a threshold.
- Underflow Bin: Inserted at the beginning for values below a threshold.
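The calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code in HistogramCalculator.cxx: it uses the textbook form of Scott's rule, h = 3.49 * s * n^(-1/3) with s the sample standard deviation, treats regular bins as left-closed with the last one also including its right edge, and counts values strictly below the underflow threshold or strictly above the overflow threshold in the extra bins. The function names are made up for this example.

```python
import math
import statistics


def scott_bin_width(values):
    """Scott's rule: h = 3.49 * s * n^(-1/3), s = sample standard deviation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    return 3.49 * statistics.stdev(values) * n ** (-1.0 / 3.0)


def histogram(values, bin_width, underflow=None, overflow=None):
    """Return (ranges, counts) with optional underflow and overflow bins.

    Assumption: regular bins are [start, end), and the last regular bin
    also includes its right edge.
    """
    lo = underflow if underflow is not None else min(values)
    hi = overflow if overflow is not None else max(values)
    n_bins = max(1, math.ceil((hi - lo) / bin_width))
    counts = [0] * n_bins
    under = over = 0
    for v in values:
        if underflow is not None and v < underflow:
            under += 1
        elif overflow is not None and v > overflow:
            over += 1
        else:
            # Clamp so that a value equal to the upper edge lands in the last bin.
            i = min(int((v - lo) // bin_width), n_bins - 1)
            counts[i] += 1
    ranges = [(lo + i * bin_width, lo + (i + 1) * bin_width) for i in range(n_bins)]
    if underflow is not None:
        # Underflow bin is inserted at the beginning.
        ranges.insert(0, (-math.inf, underflow))
        counts.insert(0, under)
    if overflow is not None:
        # Overflow bin is added at the end.
        ranges.append((overflow, math.inf))
        counts.append(over)
    return ranges, counts
```

For example, histogram([1, 2, 2, 3, 3, 3, 4, 4, 5, 20], 2.0, underflow=2, overflow=6) puts the value 1 into the underflow bin and the value 20 into the overflow bin, so neither stretches the regular bin intervals.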
You use two attributes for an underflow bin, one whether such underflow
exists and one with its value. I think that can be combined. In
implementation and schema it would be optional. The specification text
then needs to contain, what is used, when this attribute is missing.
Same for overflow. Excel has data type ST_DoubleOrAutomatic.
I still have to do this.
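A minimal sketch of how the single combined attribute could be interpreted on import. The assumptions are mine and would need to be fixed in the specification text: a missing attribute means there is no underflow (or overflow) bin, and the literal "auto" stands for the automatic value, modelled on how I read ST_DoubleOrAutomatic. The function name is hypothetical.

```python
def parse_double_or_auto(raw):
    """Interpret one optional attribute value.

    None (attribute missing) -> None, meaning "no such bin" (an assumption).
    "auto"                   -> "auto", the threshold is calculated.
    anything else            -> parsed as a double.
    """
    if raw is None:
        return None
    text = raw.strip()
    if text == "auto":
        return "auto"
    return float(text)
```

This way one optional attribute carries both pieces of information: whether the bin exists and which threshold it uses.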
You write the new attributes with XML_NAMESPACE_CHART. It has to be
XML_NAMESPACE_LO_EXT.
Corrected.
You can use the histogram chart only in ODF extended. The according case
distinctions are missing.
Corrected.
ODF uses for attributes and element names a style with natural language
terms separated by hyphen. Please keep this style. So instead of an
attribute loext:histogram-binwidth it should be
loext:histogram-bin-width. And instead of loext:histo it should be
loext:histogram.
On one hand you use a UNO property FrequencyType with datatype short and
possible value 0 to 3, on the other hand you assign the property value
to aFrequencies, which is a Sequence< double > ???
Corrected.
Excel uses for histograms the element CT_Binning (see 2.24.3.7 in
[MS-ODRAWXML]). That has the attribute intervalClosed to determine,
whether the start or end side of the bin interval is open. The
corresponding attribute is missing.
I added it to the RNG file, but I still have to make changes in other places too.
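To show what the closed side of a bin interval changes in practice, here is a small Python sketch. It assumes two possible values, "r" (right side closed, (a, b]) and "l" (left side closed, [a, b)), which is how I understand the Excel attribute; the function name and the strict treatment of the outermost edges are my own simplifications.

```python
import bisect


def bin_index(value, edges, closed="r"):
    """Return the index of the bin containing value, or None if outside.

    edges is a sorted list of n_bins + 1 bin boundaries.
    closed="r": bins are (a, b]; closed="l": bins are [a, b).
    """
    if closed == "r":
        i = bisect.bisect_left(edges, value) - 1
    else:
        i = bisect.bisect_right(edges, value) - 1
    return i if 0 <= i < len(edges) - 1 else None
```

With edges [0, 2, 4, 6], the value 2 falls into bin 0 when the right side is closed and into bin 1 when the left side is closed, so the frequencies can differ for the same data.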
Regarding Kurt's and Michael's reply
I will discuss with Tomaz (Quikee) what his thoughts are on how I should approach it.
On Tue, 17 Dec 2024 at 21:46, Kurt Nordback <kurt.nordback@xxxxxxxxxxxxxx> wrote:
This bug is relevant to the question of handling multiple series in a histogram chart.
https://bugs.documentfoundation.org/show_bug.cgi?id=163713
Kurt
On Monday, December 16th, 2024 at 11:02 PM, Mike Kaganski <mikekaganski@xxxxxxxxxxx> wrote:
> Hi Devansh, hi Regina,
>
> On 17.12.2024 4:51, Regina Henschel wrote:
>
> > You have added the 'bin' related information to the chart:series
> > element. A chart:plot-area element can have several chart:series
> > sub-elements. I guess, that you do not want to allow several series in
> > the same histogram. Excel does not allow it. Restricting it in the
> > schema is difficult. (Or do you have an idea, Michael?) I suggest to
> > restrict it in the specification text.
>
>
>
> I suggest to check if OOXML restricts it. Sticking to the current Excel
> behavior is reasonable as an implementation; but hardcoding the existing
> implementation detail of Excel as a standard's wording would make it
> hard to adapt, when Excel extends the implementation - it would need a
> breaking change or a new chart class. (I don't know if such an extension
> could make sense in principle, so this is just a general remark, maybe
> nonsensical in this context - sorry for that.)
>
>
> --
>
> Best regards,
>
> Mike Kaganski
--
Regards,
Devansh