Bug 122192 - Converting docx in headless mode hangs
Summary: Converting docx in headless mode hangs
Status: RESOLVED INSUFFICIENTDATA
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 5.3 all versions (earliest affected)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:docx
Depends on:
Blocks: Performance CPU-AT-100%
Reported: 2018-12-19 10:24 UTC by rb
Modified: 2024-04-25 06:39 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Report posted on Libreoffice forums (2.85 KB, text/plain)
2018-12-19 10:25 UTC, rb
Broken DOCX document (158.27 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2019-01-10 12:21 UTC, rb

Description rb 2018-12-19 10:24:18 UTC
Description:
My application runs on an Ubuntu 16.04 web server where uploaded files are automatically converted to PDF with doc2pdf, which is part of unoconv, which in turn uses LibreOffice in headless mode. When I try to convert a corrupted DOCX document, the conversion hangs with the CPU at 100% utilization, and eventually I have to reboot to recover.

Steps to Reproduce:
1. Have a Word document (DOCX) that is corrupted
2. Try to convert it to PDF: libreoffice --headless --convert-to pdf broken.docx
3. Trying with a Word document that is not corrupted works fine

Actual Results:
When I try to convert the corrupted DOCX document, the conversion hangs with the CPU at 100% utilization, and eventually I have to reboot to recover. The console output up to the hang is:

javaldx: Could not find a Java Runtime Environment!
Warning: failed to read path from javaldx
W: Unknown node under /registry/extlang: deprecated
W: Unknown node under /registry/grandfathered: comments
W: Unknown node under /registry/grandfathered: comments
Fontconfig warning: ignoring UTF-8: not a valid region tag
convert /home/forge/broken.docx -> /home/forge/broken.pdf using filter : writer_pdf_Export

Expected Results:
Command exits to shell with an error.


Reproducible: Always


User Profile Reset: No



Additional Info:
I would suggest that one of these things happen:

1. The command exits with an error
2. A timeout is set and, if it is reached, the command is aborted
3. LibreOffice is able to detect that the DOCX document is broken
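
Until something like the timeout suggestion exists in LibreOffice itself, a calling application can enforce one from the outside. The following is a minimal sketch, not part of unoconv or LibreOffice: the function name `convert_with_timeout` and the 120-second limit are assumptions chosen for illustration. It starts the converter in its own process group so that, on timeout, any child processes the launcher may have spawned are killed together with it (POSIX only).

```python
import os
import signal
import subprocess
import sys


def convert_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it was killed on timeout.

    Hypothetical helper for illustration; POSIX only (uses process groups).
    """
    # start_new_session=True makes the child the leader of a new session and
    # process group, so its group id equals its pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Kill the whole group, not just the launcher process.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the child so it does not linger as a zombie
        return None


# Example call, using the command from the steps to reproduce
# (the 120 s limit is an arbitrary assumption):
# convert_with_timeout(
#     ["libreoffice", "--headless", "--convert-to", "pdf", "broken.docx"], 120)
```

A return value of `None` then tells the web application that the conversion was aborted, so the upload can be rejected instead of leaving a process spinning at 100% CPU.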

Unfortunately, I cannot provide you with the broken Word document because it contains sensitive information. Censoring the sensitive information would require me to create a new document, which would not be corrupted.

The question was originally asked here (I will attach a text version to this bug report):

https://ask.libreoffice.org/en/question/174451/converting-docx-in-headless-mode-hangs/
Comment 1 rb 2018-12-19 10:25:33 UTC
Created attachment 147666 [details]
Report posted on Libreoffice forums
Comment 2 Roman Kuznetsov 2018-12-21 08:00:02 UTC
Please attach your corrupted DOCX file. You can erase all sensitive information by replacing all characters with the letter "a".
Comment 3 rb 2019-01-10 12:21:49 UTC
Created attachment 148213 [details]
Broken DOCX document

Since DOCX is "just" a zip file, I managed to unzip it, remove the sensitive data and then zip it again while keeping the broken state.
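
Because a DOCX file is a ZIP container, a cheap pre-check before conversion is possible on the application side. The sketch below is an illustration only: the function name `docx_precheck` is hypothetical, and it catches container-level damage (invalid archive, failed CRC, missing or malformed `word/document.xml`). It is an assumption that such a check is useful here; the attached file could be re-zipped while staying broken, which suggests its defect lies in the document content and may well pass this check.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def docx_precheck(path_or_file):
    """Return None if the container looks sane, else a short error string.

    Hypothetical helper; it does not validate the document against the
    OOXML schema, only the ZIP container and XML well-formedness.
    """
    try:
        with zipfile.ZipFile(path_or_file) as zf:
            bad_member = zf.testzip()  # first member with a bad CRC, or None
            if bad_member is not None:
                return f"corrupt member: {bad_member}"
            if "word/document.xml" not in zf.namelist():
                return "missing word/document.xml"
            # Note: for untrusted uploads a hardened XML parser is advisable.
            ET.fromstring(zf.read("word/document.xml"))
    except zipfile.BadZipFile as exc:
        return f"not a valid ZIP archive: {exc}"
    except ET.ParseError as exc:
        return f"word/document.xml is not well-formed XML: {exc}"
    return None
```

The function accepts a file path or a file-like object, so it can be run on an upload before the file is handed to the converter.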
Comment 4 Buovjaga 2019-07-13 14:23:43 UTC
Doesn't hang for me with headless conversion of attachment 148213 [details]

Can you try with 6.2.x or with 6.3, which is in release candidate state?

Set to NEEDINFO.
Change back to UNCONFIRMED, if the problem persists. Change to RESOLVED WORKSFORME, if the problem went away.

Arch Linux 64-bit
Version: 6.4.0.0.alpha0+
Build ID: 1ce1c26dd98e6477139e08d1ebe89fa950ff5fb0
CPU threads: 8; OS: Linux 5.2; UI render: default; VCL: gtk3; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Built on 12 July 2019
Comment 5 QA Administrators 2020-01-10 04:11:34 UTC Comment hidden (obsolete)
Comment 6 QA Administrators 2020-02-10 03:33:38 UTC
Dear rb,

Please read this message in its entirety before proceeding.

Your bug report is being closed as INSUFFICIENTDATA due to inactivity and
a lack of information which is needed in order to accurately
reproduce and confirm the problem. We encourage you to retest
your bug against the latest release. If the issue is still
present in the latest stable release, we need the following
information (please ignore any that you've already provided):

a) Provide details of your system including your operating
   system and the latest version of LibreOffice that you have
   confirmed the bug to be present

b) Provide easy to reproduce steps – the simpler the better

c) Provide any test case(s) which will help us confirm the problem

d) Provide screenshots of the problem if you think it might help

e) Read all comments and provide any requested information

Once all of this is done, please set the bug back to UNCONFIRMED
and we will attempt to reproduce the issue. Please do not:

a) respond via email 

b) update the version field in the bug or any of the other details
   on the top section of our bug tracker

Warm Regards,
QA Team

MassPing-NeedInfo-FollowUp