Bug 101626

Summary: Hyphen missing in pdf export
Product: LibreOffice
Reporter: Oliver Sander <oliver.sander>
Component: Printing and PDF export
Assignee: Not Assigned <libreoffice-bugs>
Status: RESOLVED FIXED
Severity: normal
CC: ilmari.lauhakangas, klasse
Priority: medium
Keywords: filter:docx, filter:pdf
Version: Inherited From OOo   
Hardware: All   
OS: All   
Whiteboard: target:5.4.0
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 103378    
Attachments: Original docx document
pdf file as created by MS Office
pdf file as created by LibreOffice 5.2.0.4
.doc demo hack patch

Description Oliver Sander 2016-08-20 19:50:31 UTC
Created attachment 126919
Original docx document

I have a docx file with a short text, which starts with a hyphen.  The hyphen appears when opening the file, but it is missing when exporting the text to a pdf file.
Comment 1 Oliver Sander 2016-08-20 19:52:01 UTC
Created attachment 126920
pdf file as created by MS Office
Comment 2 Oliver Sander 2016-08-20 19:52:38 UTC
Created attachment 126921
pdf file as created by LibreOffice 5.2.0.4
Comment 3 Buovjaga 2016-09-20 19:16:56 UTC
Repro.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.3.0.0.alpha0+
Build ID: 5c8ad526447934a5eae94fa5f40584083a874d9f
CPU Threads: 8; OS Version: Linux 4.7; UI Render: default; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on September 19th 2016

Arch Linux 64-bit
LibreOffice 3.3.0 
OOO330m19 (Build:6)
tag libreoffice-3.3.0.4
Comment 4 Oliver Sander 2016-10-21 18:04:03 UTC
Reproduced in

LibreOfficeDev 5.3.0.0.alpha1 f4ca1573fcf445164c068c1046ab5d084e1b005f
Debian Testing / Mate Desktop
Comment 5 Caolán McNamara 2017-01-26 11:12:31 UTC
Created attachment 130694
.doc demo hack patch

Word uses 0x1F for its soft hyphens, while Writer uses 0xAD. The source document contains 0xAD, which in Word is "just a char" but for us is a non-printing soft hyphen. One solution might be to transform it on import to 0x2010. The attached patch does that for the .doc format (not quite completely safe in this version); I'm not sure where the docx equivalent lives.
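
For illustration only, the import-time transformation suggested above could be sketched roughly as follows. This is Python, not LibreOffice's C++, and it is not the code from the attached patch; the function name `replace_soft_hyphens` is hypothetical. It assumes the imported text is already available as a Unicode string.

```python
# Hypothetical sketch: map the soft hyphen (U+00AD), which Writer treats as
# a non-printing character, to the visible hyphen (U+2010) during import.
SOFT_HYPHEN = "\u00ad"  # SOFT HYPHEN
HYPHEN = "\u2010"       # HYPHEN


def replace_soft_hyphens(text: str) -> str:
    """Return text with every U+00AD replaced by U+2010."""
    return text.replace(SOFT_HYPHEN, HYPHEN)


# Example: a short text that starts with 0xAD, as in the reported document.
imported = SOFT_HYPHEN + " short text"
converted = replace_soft_hyphens(imported)
```

After the conversion the leading character is an ordinary printing hyphen, so it would no longer depend on how the exporter treats soft hyphens.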
Comment 6 Johnny_M 2017-02-01 18:59:20 UTC
Fixed for LO 5.4 (current master) in https://gerrit.libreoffice.org/33707 which isn't shown here for some reason. Thanks!

Could someone please cherry-pick this for LO 5.3 and 5.2 in Gerrit? It errors out with "wrong author" if I try.
Comment 7 Adolfo Jayme Barrientos 2017-02-03 17:25:25 UTC
(In reply to Johnny_M from comment #6)
> Fixed for LO 5.4 (current master) in https://gerrit.libreoffice.org/33707
> which isn't shown here for some reason. Thanks!

It isn’t shown here because that patch has not been merged; it’s still under review.
Comment 8 Commit Notification 2017-02-06 22:06:48 UTC
Patrick J committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b322fbc548479e09db8a437764a4089652ffbca2

tdf#101626: replace soft-hyphen by hard-hyphen in ooxml import

It will be available in 5.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 9 Buovjaga 2017-02-07 13:20:25 UTC
Ok, now I see the hyphen in PDF export.

Win 7 Pro 64-bit Version: 5.4.0.0.alpha0+
Build ID: 83e059af2203ec0cd15dea08cfa538555ba14bd7
CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-02-06_23:34:43
Locale: fi-FI (fi_FI); Calc: group