Bug 75930

Summary: IMPORT MathML: some characters are missing
Product: LibreOffice
Reporter: Mike Kaganski <mikekaganski>
Component: Formula Editor
Assignee: Not Assigned <libreoffice-bugs>
Status: RESOLVED FIXED    
Severity: normal
CC: jalojo, marcos.souza.org, serval2412, xiscofauli
Priority: medium    
Version: 4.2.0.0.beta1   
Hardware: Other   
OS: All   
Whiteboard:
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 118765    
Attachments: ZIP with problematic MML and screenshot
Actual encoding (LO 4.2.2.1)

Description Mike Kaganski 2014-03-09 01:06:54 UTC
Created attachment 95380
ZIP with problematic MML and screenshot

Importing the MML file from the attachment skips some characters.
It seems that the skipped characters are those that immediately precede numeric character references (&#xXXXX;).
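
To make the expected behaviour concrete, here is a minimal sketch using Python's standard XML parser rather than LibreOffice's importer. The MathML fragment and the helper name `token_text` are hypothetical and are not taken from the attachment. The fragment only mimics the pattern described above: an ordinary character followed directly by a numeric character reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical MathML fragment (NOT the content of the attachment): the "b"
# immediately precedes the numeric character reference &#x3B1; (Greek alpha).
SAMPLE = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mtext>ab&#x3B1;c</mtext>"
    "</math>"
)

MATHML_NS = "{http://www.w3.org/1998/Math/MathML}"


def token_text(mathml: str) -> str:
    """Return the text content of the first <mtext> child of <math>."""
    root = ET.fromstring(mathml)
    token = root.find(f"{MATHML_NS}mtext")
    return token.text


# A correct import keeps every character, including the one
# right before the reference.
expected = "ab\u03b1c"
actual = token_text(SAMPLE)
print(actual == expected)
```

An importer showing the behaviour reported here would presumably lose the "b" in such a fragment, while the reference itself and the text after it would survive.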

Also, in the attachment there is a screenshot displaying the current state of import, and the expected result. Problematic places are marked.

Tested with 4.2.0.0.beta1-4.2.2.1 under Win7x64, and 4.2.1.1 under Ubuntu 13.10 x64. Previous versions couldn't handle this file at all.

Marcos, adding you to CC as you suggested in bug 59642, because you are the expert in this area. Please excuse me if it's a wrong thing to do.
Comment 1 Jacques Guilleron 2014-03-10 11:05:09 UTC
Hi Mike,

Reproduced with LO 4.2.2.1 and LO 4.3.0.0.alpha0+
Build ID: 7122ef19847b26529ed1d5bad40df869e91a8495
TinderBox: Win-x86@39, Branch:master, Time: 2014-03-06_00:38:21
& Windows 7 Home Premium.
I also confirm that this file cannot be opened with LO 3.6.6.2.
For comparison, I am adding the actual encoding (LO 4.2.2.1) that displays correctly.

Set status to NEW.

Kind regards,

Jacques
Comment 2 Jacques Guilleron 2014-03-10 11:07:02 UTC
Created attachment 95498
Actual encoding (LO 4.2.2.1)
Comment 3 QA Administrators 2016-02-21 08:35:22 UTC Comment hidden (obsolete)
Comment 4 Mike Kaganski 2016-02-22 08:57:10 UTC
Still reproducible with 5.1.0.3
Comment 5 QA Administrators 2017-03-06 15:13:58 UTC Comment hidden (obsolete)
Comment 6 Mike Kaganski 2017-03-06 20:31:58 UTC
Still reproducible with 5.3.1.1.
Comment 7 QA Administrators 2018-03-07 03:41:16 UTC Comment hidden (obsolete)
Comment 8 Mike Kaganski 2018-03-07 04:22:43 UTC
Still present in Version: 6.0.2.1 (x64)
Build ID: f7f06a8f319e4b62f9bc5095aa112a65d2f3ac89
CPU threads: 4; OS: Windows 10.0; UI render: GL; 
Locale: ru-RU (ru_RU); Calc: CL
Comment 9 QA Administrators 2019-07-15 02:48:29 UTC Comment hidden (obsolete)
Comment 10 Julien Nabet 2020-08-01 14:37:28 UTC
On a Debian x86-64 PC with master sources updated today, I could open the second attachment (fa.mml).

As for the first one, which contains f.mml, it displays
"OT" instead of an italic "OM", but I don't know whether that's OK.

Any update here with LO 6.4.5?
Comment 11 dante19031999 2020-11-12 22:52:17 UTC
The bug is real. LO has no &HEX; support (hexadecimal numeric character references) for MathML. It only accepts the first Unicode character of the string.
Comment 12 Julien Nabet 2020-11-13 06:25:42 UTC
Dante: let's put this one to ASSIGNED since you assigned yourself.
Comment 13 Xisco Faulí 2022-05-02 14:39:55 UTC
Dear Dante,
This bug has been in ASSIGNED status for more than 3 months without any
activity. Resetting it to NEW.
Please assign it back to yourself if you're still working on this.
Comment 14 QA Administrators 2024-05-02 03:15:23 UTC Comment hidden (obsolete)
Comment 15 Mike Kaganski 2024-05-02 05:56:41 UTC
Fixed in 6.1 by commit bf46b46a1d734348096936284fb8a76e977936d0 (Moving XSAXDocumentBuilder2 to use XFastDocumentHandler:, 2018-03-14)
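
The commit title does not spell out the root cause, so the following is an illustration only, not a description of the LibreOffice code. A general pitfall with SAX-style handlers is that the parser may deliver the text of one element in several chunks, so a handler has to accumulate them instead of keeping only one. The sketch below uses Python's `xml.sax`. The class `TokenTextHandler`, the function `collect_mtext`, and the sample fragment are all hypothetical.

```python
import xml.sax

# Hypothetical fragment (NOT from the attachment): a plain character
# directly followed by a numeric character reference.
SAMPLE = "<math><mtext>ab&#x3B1;c</mtext></math>"


class TokenTextHandler(xml.sax.ContentHandler):
    """Collects the full text of every <mtext> element."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self.texts = []

    def startElement(self, name, attrs):
        # Start a fresh buffer for each element.
        self._chunks = []

    def characters(self, content):
        # The parser is allowed to call this several times per element,
        # so append each chunk instead of overwriting the previous one.
        self._chunks.append(content)

    def endElement(self, name):
        if name == "mtext":
            self.texts.append("".join(self._chunks))


def collect_mtext(mathml: str) -> list:
    """Parse the string and return the text of each <mtext> element."""
    handler = TokenTextHandler()
    xml.sax.parseString(mathml.encode("utf-8"), handler)
    return handler.texts


print(collect_mtext(SAMPLE) == ["ab\u03b1c"])
```

Because the handler joins all chunks, the result does not depend on how the parser happens to split the character data.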