Re: [core] xmloff: Histogram Chart ODF import and export support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi community,

To follow up on the previous mail, I'd like to give you a more detailed update on how we're addressing the histogram data handling issue in the chart2 module, which Tomaž and Regina pointed out.


Tomaž also said that he would like to introduce a variable XValuesCalculated which will resolve most of the problem.

Further Analysis -
  •     Original X-Values: The original X-values are correctly stored in XLabeledDataSequence.
  •     Histogram Calculations: The histogram calculations occur within the HistogramDataInterpreter and then the calculated values overwrite the XLabeledDataSequence x values.
  •     Need for Separation: We need to store and display the histogram-calculated X-values while preserving the original data as many parts of the code depend on original values.



Analysis of HistogramChartType.cxx

Purpose: This file defines the HistogramChartType class, that is responsible for setting up a histogram chart type. It manages properties specific to histograms,
like bin width and range. It also implements the logic to create the data-series and do the calculations.

createCalculatedDataSeries() Method:
  • This method retrieves the raw data through the XLabeledDataSequence from m_aDataSeries. This is where the logic modifies the source data (the sequence returned from getValues()).
  • It then uses HistogramCalculator to perform calculations and generate the data.
  • Finally, it creates a new HistogramDataSequence to store calculated values and then adds that to m_aDataSeries.

Interaction with 
XLabeledDataSequence:
  •         It gets the original data using the getDataSequences2() method of DataSeries.
  •         It gets the original x values by using getValues() method of the XLabeledDataSequence which returns an XDataSequence and then uses the getData() method of the XDataSequence to retrieve the x values in a sequence of uno::Any.


Analysis of HistogramCalculator.cxx

Purpose: This class performs the core histogram calculations using the raw data that is provided to computeBinFrequencyHistogram() method.

Bin Calculation Logic:
  •         The class calculates the number of bins, bin width, and bin ranges based on the input data.
  •         It computes bin ranges and their frequencies.
  •         The calculated values are stored internally in maBinRanges and maBinFrequencies.



Understanding HistogramDataSequence.cxx

Data Storage:
  •         The primary data storage is the mxValues member, which is a uno::Sequence<double>. This sequence stores the calculated bin values for the histogram.
  •         There is also mxLabels which stores labels but we are not concerned with it.

Interface Implementations:

  •         HistogramDataSequence implements several UNO interfaces including XNumericalDataSequence, XTextualDataSequence, XDataSequence.
  •         getNumericalData(): Returns the mxValues as a uno::Sequence<double>.
  •         getTextualData(): Returns an empty uno::Sequence<OUString>.
  •         getData(): Returns the mxValues as a uno::Sequence<uno::Any>.



Proposed Solution:
To address these issues, this is how I am thinking of making the changes:

   * Add 
XValuesCalculated Property(As said by Tomaž) :
  •         Add an [optional, attribute] sequence<doubleXValuesCalculated; to the DataSeries service definition (offapi/com/sun/star/chart2/DataSeries.idl).
  •         Implement the corresponding storage, getter, and setter methods in DataSeries.hxx and DataSeries.cxx. (Add storage for m_aXValuesCalculated)
  •         Add the corresponding property support in OPropertySet.hxx and OPropertySet.cxx?
       Why add it to DataSeries? -> If my understanding is correct, XValuesCalculated is a property associated with a DataSeries and is not a new data structure by itself, so we do not need to create a new IDL file for it.
      No Boolean Flag Needed: -> we do not need an explicit boolean flag.

       I am using .hasElements() and why Length Check is Sufficient:
  •     Presence = Calculated Values: The presence of elements within the XValuesCalculated sequence is sufficient to determine if calculated X-values exist.
  •     Empty Sequence = No Calculated Values: An empty sequence, which tells us that no values have been calculated. This is very convenient because UNO Sequences can be empty.
  •     No Redundancy: Using a separate boolean flag would add redundancy. 


   * Modify Histogram Logic:
  •         Update HistogramDataInterpreter.cxx to perform calculations and store calculated bin values in the XValuesCalculated property of the corresponding DataSeries object. This will not touch the original values in the XLabeledDataSequence.
  •         And the Histogram View utilises the BarChart view for the final rendering.

  
* Modify Rendering Logic:
  •         Modify chart renderers (such as VSeriesPlotter.cxx ?) to prioritize the XValuesCalculated property when available for plotting. When the XValuesCalculated is not available then it will use the original values of XLabeledDataSequence.

   
* Adjust Axis Scaling:
  •         Modify the code to use the XValuesCalculated property, when available, for axis scaling range calculation.

   
* Preserve Original Data in UI:
  •         Ensure that UI components always display original data fetched from XLabeledDataSequence with the role of "values-x".
  •         Since users expect the UI tables to reflect the underlying data they have provided. If we show calculated bin values in tables/cells, it will be confusing and wrong.

   
* Comprehensive Testing:
  •         Create tests to ensure that the changes work and do not cause a regression to ensure proper functionality and maintain backward compatibility.
  •         We will also make sure the undo and redo are working properly.

   
* Documentation:
  •         Document all the changes to reflect the new property and its usage.


Key Benefits:
  1.     Correct Histogram Display: Histograms will correctly use calculated bin values for the X-axis.
  2.     Preserved Original Data: Original source data will be maintained for other charts and user interactions.
  3.     Minimal Impact: This change does not affect other chart types, it ensures backward compatibility, and it only impacts histogram and renderer-specific logic.
  4.     Clear Separation: Separates the concern of raw data and calculated data.
  5.     Improved Architecture: Addresses the underlying issue of assumptions embedded in the chart(chart2) module.



Is the below file related?



Do we also have to modify getValues() in XLabeledDataSequence to return calculated values or original values based on the presence of XValuesCalculated property and chart type?

If other places need changes or I if have still missed files (please let me know)
and the changes in the offapi/com/sun/star/chart2/DataSeries.idl are correct?

On Thu, 23 Jan 2025 at 18:59, Regina Henschel (via Code Review) <gerrit@xxxxxxxxxxxxxxxxxxxxxx> wrote:
Attention is currently required from: Devansh Varshney, Tomaž Vajngerl.

Regina Henschel has posted comments on this change by Devansh Varshney. ( https://gerrit.libreoffice.org/c/core/+/177364?usp=email )

Change subject: xmloff: Histogram Chart ODF import and export support
......................................................................


Patch Set 26:

(2 comments)

Patchset:

PS26:
There is a deeper problem. You have set the calculated binFrequencies and binRanges in place of the original data. That has give you a convenient way for getting a rendering as bar chart. Not only does this cause bugs tdf#164109 and tdf#162240, but now you can't export the <chart:series> element the same way as it is done for other chart types. I'm referring to the way that can be seen after comment '// export dataseries for current chart-type' in SchXMLExport.cxx.

Currently in your patch, the <chart:series> element is not exported at all. When it will be exported, the new <loext:histogram-configuration> element has to become a child element after all the existing child elements. If the export looks similar to the existing chart types, it would be just before the comment '// close series'.

I suggest to discuss the underlying problem with Tomaž Vajngerl. I don't see an easy solution and would have to try a lot. Until this is resolved, it doesn't make sense to make progress on the export.


File schema/libreoffice/OpenDocument-v1.4+libreoffice-schema.rng:

https://gerrit.libreoffice.org/c/core/+/177364/comment/2e8b89d8_b939b3f4?usp=email :
PS26, Line 2149:         <rng:optional>
               :           <rng:ref name="loext-histogram-configuration"/>
               :         </rng:optional>
> I mean the `loext:histogram-configuration` is placed under `chart:series` is this the correct way? […]
The position is correct.



--
To view, visit https://gerrit.libreoffice.org/c/core/+/177364?usp=email
To unsubscribe, or for help writing mail filters, visit https://gerrit.libreoffice.org/settings?usp=email

Gerrit-MessageType: comment
Gerrit-Project: core
Gerrit-Branch: master
Gerrit-Change-Id: Ibda198c1a3e4a7b53e3cf590c33cc9adb558be74
Gerrit-Change-Number: 177364
Gerrit-PatchSet: 26
Gerrit-Owner: Devansh Varshney <varshney.devansh614@xxxxxxxxx>
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: Tomaž Vajngerl <quikee@xxxxxxxxx>
Gerrit-CC: Regina Henschel <rb.henschel@xxxxxxxxxxx>
Gerrit-Attention: Devansh Varshney <varshney.devansh614@xxxxxxxxx>
Gerrit-Attention: Tomaž Vajngerl <quikee@xxxxxxxxx>
Gerrit-Comment-Date: Thu, 23 Jan 2025 13:29:36 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Comment-In-Reply-To: Devansh Varshney <varshney.devansh614@xxxxxxxxx>


--
Regards,
Devansh

[Index of Archives]     [LARTC]     [Bugtraq]     [Yosemite Forum]     [Photo]

  Powered by Linux