Bug 112950

Summary: PDF: Hebrew characters overlapping or very close together with David CLM font
Product: LibreOffice
Reporter: Eyal Rozenberg <eyalroz1>
Component: Printing and PDF export
Assignee: Not Assigned <libreoffice-bugs>
Status: RESOLVED WORKSFORME
Severity: normal
CC: iorsh, kaplanlior, philipz85, xiscofauli
Priority: medium
Keywords: bibisectRequest, regression
Version: 5.4.2.2 release   
Hardware: All   
OS: Linux (All)   
Whiteboard:
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 103378, 112812    
Attachments: ODT document whose PDF export gets messed up
PDF export result
3-page odt test doc
5.3 PDF vs 5.4 PDF
close comparison 5.3 vs 5.4
close comparison 5.3 vs 5.4

Description Eyal Rozenberg 2017-10-06 19:47:25 UTC
Created attachment 136814 [details]
ODT document whose PDF export gets messed up

I'm using LO writer 5.4.2.2 release on Linux Mint 18.2.

I have an ODT document which has (probably) been edited by MS Word at some point. When I export it to PDF, some of the Hebrew letters overlap each other, some don't. On the second page, after having pressed the numbered list toolbar button, the PDF export renders the text almost properly, but not quite - spacing is off. Specifically, notice the lack of distance between the rightmost Heh (ה) and Ain (ע) on the top line on page 2.

Finally, when I remove the end of the paragraph and keep most of the first line - on page 3 - the rendering looks just fine.

I suspect this is not a recent regression, since I recall having experienced this with previous versions - but now I got annoyed enough to spend the time creating a small manifesting example and reporting it.
Comment 1 Eyal Rozenberg 2017-10-06 19:47:52 UTC
Created attachment 136815 [details]
PDF export result
Comment 2 Eyal Rozenberg 2017-10-06 19:48:54 UTC
Note you'll need the David CLM font. You can get it here:

https://sourceforge.net/projects/culmus/files/culmus/0.131/culmus-0.131.tar.gz/download
Comment 3 Xisco Faulí 2017-10-07 08:10:00 UTC
You can't confirm your own bugs. Moving it back to UNCONFIRMED until someone else confirms it.
Comment 4 Eyal Rozenberg 2017-10-07 17:07:27 UTC
On this thread:
https://whatsup.org.il/index.php?name=PNphpBB2&file=viewtopic&p=421925#421925

Several people are confirming it while others do not see it happening with their versions. I'm hoping some of them would come over here to confirm...
Comment 5 Lior Kaplan 2017-10-12 10:16:32 UTC
Confirmed with LibreOffice 5.4.1 on Debian 64bit.

Seems to happen only with Culmus fonts. CCing Maxim Iorsh for his opinion on this (as their creator).
Comment 6 Yousuf Philips (jay) (retired) 2017-10-12 21:49:45 UTC
Created attachment 136938 [details]
3-page odt test doc

As attachment 136814 [details] only has 2 pages and the exported PDF in attachment 136815 [details] has 3, I created this 3-page test document.
Comment 7 Yousuf Philips (jay) (retired) 2017-10-12 21:56:45 UTC
The overlapping characters are clearly a regression introduced in the 5.4 cycle. The distance between the close Heh and Ain characters varied in both 5.3 and 5.4 based on whether the text is in a numbered list or not, but in my tests it was never as close as in attachment 136815 [details]. Tested on Linux Mint 18.0.

Version: 5.3.7.0.0+
Build ID: a562be54f3127f4e22a3a38e62db2b38d48499f3
CPU Threads: 2; OS Version: Linux 4.4; UI Render: default; VCL: gtk2; Layout Engine: new; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:libreoffice-5-3, Time: 2017-09-19_03:52:04
Locale: en-US (en_US.UTF-8); Calc: group

Version: 5.4.3.0.0+
Build ID: fb64cf127dc6398f5d18d186a93966837db0bb1e
CPU threads: 2; OS: Linux 4.4; UI render: default; VCL: gtk2; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:libreoffice-5-4, Time: 2017-09-27_12:54:32
Locale: en-US (en_US.UTF-8); Calc: group
Comment 8 Yousuf Philips (jay) (retired) 2017-10-12 21:57:25 UTC
Created attachment 136939 [details]
5.3 PDF vs 5.4 PDF
Comment 9 Eyal Rozenberg 2017-10-13 16:54:14 UTC
Also note that some letters are spaced slightly too far apart. I wonder if there isn't some kind of "index-is-off" issue here in accessing the amount of horizontal space necessary per character.
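
To illustrate the hypothesis above, here is a minimal sketch of how an off-by-one lookup into a table of per-glyph advance widths could produce both symptoms at once (some glyphs too close, others too far apart). The advance values are made up for illustration and are not taken from David CLM, the function names are hypothetical, and the layout is simplified to left-to-right; nothing here reflects LibreOffice's actual layout code.

```python
# Hypothetical per-glyph advance widths in font units (illustrative only).
ADVANCES = [500, 300, 620, 280, 540]


def pen_positions(advances):
    """Return the x position at which each glyph starts."""
    positions, x = [], 0
    for adv in advances:
        positions.append(x)
        x += adv
    return positions


def pen_positions_off_by_one(advances):
    """Same layout, but each glyph wrongly uses its successor's advance."""
    shifted = advances[1:] + advances[-1:]
    return pen_positions(shifted)


correct = pen_positions(ADVANCES)
buggy = pen_positions_off_by_one(ADVANCES)

# Negative delta: the glyph starts earlier than intended (crowding or overlap).
# Positive delta: the glyph starts later than intended (extra space).
deltas = [b - c for b, c in zip(buggy, correct)]
print(deltas)
```

With these made-up widths the deltas have mixed signs, which is the kind of pattern described in this report: some letters crowded together and some pushed apart on the same line.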
Comment 10 Maxim Iorsh 2017-10-13 20:47:56 UTC
Created attachment 136964 [details]
close comparison 5.3 vs 5.4

Looking at the PDF comparison, it looks like certain groups of letters are shifted right, leaving their surroundings intact - see attached "5.3 vs 5.4 close comparison" image (made from jay's PNG).

The shifted groups (first line only) are
 * ayn-resh
 * lamed-yod
 * nun-vav-gimel-ayn
 * lamed-alef-vav-pe
 * he-lamed-vav
 * samech-pe-yod
 * het-shin-bet-vav-nun-vav
 * bet-alef
 * alef-het

Honestly, I can't see any logic in this collection.
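
The grouping above was done by eye. As a rough sketch of how it could be done mechanically, assuming the per-glyph x positions of the same line have already been extracted from the two PDFs by some other means (the extraction step is not shown, and the numbers below are invented), consecutive glyphs that moved by more than a tolerance can be collected into runs. The function name `shifted_runs` and the tolerance value are hypothetical.

```python
def shifted_runs(old_x, new_x, tol=0.5):
    """Return inclusive (start, end) index pairs of consecutive glyphs
    whose x position differs between the two renderings by more than tol."""
    n = min(len(old_x), len(new_x))
    runs, start = [], None
    for i in range(n):
        moved = abs(new_x[i] - old_x[i]) > tol
        if moved and start is None:
            start = i                      # a shifted run begins here
        elif not moved and start is not None:
            runs.append((start, i - 1))    # the run ended at the previous glyph
            start = None
    if start is not None:
        runs.append((start, n - 1))        # run extends to the end of the line
    return runs


# Invented positions: glyphs 2-3 and glyph 6 are shifted, the rest are not.
old = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
new = [0.0, 10.0, 23.0, 33.0, 40.0, 50.0, 62.0]
print(shifted_runs(old, new))
```

Mapping the resulting index ranges back to letters would give a list comparable to the one above, which might make a pattern easier to spot.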
Comment 11 Maxim Iorsh 2017-10-13 20:54:32 UTC
Created attachment 136965 [details]
close comparison 5.3 vs 5.4

Shifted groups underlined with blue
Comment 12 zdevir 2017-10-14 12:57:51 UTC
Confirmed on both Linux (latest 5.4 RC) and Windows (5.4.1.2). Problem occurs with all fonts, including David and Narkisim. I guess it has something to do with the internal representation of the text.
Comment 13 Eyal Rozenberg 2017-10-14 14:04:33 UTC
(In reply to zdevir from comment #12)
> Problem occurs with all fonts, including David and Narkisim.

All Culmus fonts, you mean? Or have you seen this with other Hebrew fonts?
Comment 14 Omer Zak 2017-11-20 20:03:13 UTC
The problem was not reproduced in:

Version: 6.0.0.0.alpha1+
Build ID: 9050854c35c389466923f0224a36572d36cd471a
CPU threads: 8; OS: Linux 4.9; UI render: default; VCL: gtk3; 
Locale: en-US (en_US.utf8); Calc: group

OS: Debian 64bit Stretch (Debian 9.2, with some backported packages)

The reported font for the document's text was David CLM.
Comment 15 Xisco Faulí 2017-11-20 20:06:28 UTC
(In reply to Omer Zak from comment #14)
> The problem was not reproduced in:
> 
> Version: 6.0.0.0.alpha1+
> Build ID: 9050854c35c389466923f0224a36572d36cd471a
> CPU threads: 8; OS: Linux 4.9; UI render: default; VCL: gtk3; 
> Locale: en-US (en_US.utf8); Calc: group
> 
> OS: Debian 64bit Stretch (Debian 9.2, with some backported packages)
> 
> The reported font for the document's text was David CLM.

Duplicate of bug 113428?
@Yousuf, what do you think ?
Comment 16 Yousuf Philips (jay) (retired) 2017-11-21 09:53:19 UTC
(In reply to Xisco Faulí from comment #15)
> Duplicate of bug 113428?
> @Yousuf, what do you think ?

Khalid fixed that bug on the 8th, and if this bug is still showing up with Omer's build from the 13th then it wouldn't be a duplicate.

This bug needs to be bibisected to know where the issue first arose.
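
For readers unfamiliar with the term: bibisecting is a binary search over prebuilt binaries, normally driven by `git bisect` in a bibisect repository. The sketch below only illustrates the search idea. The build labels and the `is_bad` predicate are hypothetical; in practice "bad" would mean exporting the test document to PDF with that build and checking the glyph spacing by eye.

```python
from typing import Callable, Sequence


def first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumes builds are in chronological order, builds[0] is good,
    builds[-1] is bad, and the bug never disappears once introduced.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] is good, builds[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]


# Hypothetical build labels and a stand-in predicate for the manual check.
builds = [f"build-{i:03d}" for i in range(100)]
print(first_bad(builds, lambda b: int(b.split("-")[1]) >= 37))
```

With 100 builds this needs about seven manual PDF exports rather than up to a hundred, which is why a bibisect is the usual next step for a suspected regression.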
Comment 17 Eyal Rozenberg 2017-11-21 14:50:09 UTC
(In reply to Yousuf Philips (jay) from comment #16)
> (In reply to Xisco Faulí from comment #15)
> > Duplicate of bug 113428?

Doesn't look like a proper dupe. In bug 113428 characters fully overlap, or worse; with this bug, it's more of a spacing issue.

Also, 113428 does not manifest with 5.4.

Of course... it's not impossible that fixing that one somehow affected this one but I really can't say. And they're both about horizontal placement of glyphs on a line, so they're obviously related.
Comment 18 Xisco Faulí 2017-11-24 23:40:49 UTC
My question after reading comment 14 is: is this issue fixed on master? If so, we can close it as RESOLVED WORKSFORME...
Comment 19 Eyal Rozenberg 2018-01-07 13:12:58 UTC
Let's just say it was fixed somehow.