
Prepare MS Word Document for Import

To import content from Microsoft Word into Paligo, you need to apply certain styles to your content and make some adjustments. This is because MS Word content is unstructured, and Paligo needs it to have some form of structure before it can be imported.

We recommend that you keep one copy of your Word file for continued use as a Word document and prepare a separate copy for the Paligo import.

To prepare your MS Word file for import, complete the following tasks.

We recommend that you give your MS Word document a title. To create the title, add your title text and apply the Title style to it. For more information on using styles, see the official Microsoft Word documentation.

Microsoft-word-title.png

If you try to import a Word document that does not have a title, Paligo will apply a default title, such as "Article d3e1".

In some cases, MS Word documents that have no title may cause the import process to fail. If this happens, you may see the following error message:

"The import file could not be created. Nothing to import. The intermediate transformation resulted in an empty document."

Paligo uses the style names from MS Word to map your Word content to the appropriate elements in Paligo. For example, Paligo uses the Heading 1 style in Word to determine which parts of a Word document should be a top-level topic in Paligo.

For the import and mapping to work, it is important that you use the styles for formatting your MS Word content. You can create and apply styles from the Styles panel in Word:

Styles panel from Microsoft Word document. It has a current style field, a new style button, a select all button and a list of styles to choose from.

To work with Paligo, the styles need to have specific names:

  • Create a style called Title and apply it to the name of your document, for example, the title on the front cover. Make sure that the title is the first text that appears in the document.

    Note

    If you do not give your document a title, Paligo gives the publication a default name, such as "Article d4e1", when you import the document. In some cases, the import may fail.

  • If your Word document has a subtitle, create a Subtitle style and apply it to the subtitle text.

  • Make sure your headings use styles named Heading 1 through to Heading 6.

  • If you have code examples, create a paragraph style called Source Code and apply it to the code text. This has to be a paragraph style and not an inline style that is only applied to words inside a sentence.

The names of the styles are important for the Paligo import, so please make sure they match the names given above.

Tip

To find out how to set up styles, see the Microsoft documentation.
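
The style-name requirements above can be checked before you import. The following is a minimal sketch, not a Paligo tool: it assumes you have already extracted each paragraph's style name and text from the Word file, and the function name `check_style_names` is hypothetical.

```python
from typing import List, Tuple

# Style names taken from the list above.
HEADING_STYLES = {f"Heading {n}" for n in range(1, 7)}
KNOWN_STYLES = {"Title", "Subtitle", "Source Code"} | HEADING_STYLES


def normalize(name: str) -> str:
    """Ignore case and spaces so that near misses can be detected."""
    return name.replace(" ", "").lower()


def check_style_names(paragraphs: List[Tuple[str, str]]) -> List[str]:
    """Return warnings for (style_name, text) pairs given in document order."""
    warnings = []
    # Empty paragraphs do not count as "the first text in the document".
    with_text = [(style, text) for style, text in paragraphs if text.strip()]
    if not with_text or with_text[0][0] != "Title":
        warnings.append("The first text in the document does not use the Title style.")
    for style, text in with_text:
        for known in KNOWN_STYLES:
            # Flag names such as "heading1" or "Source code" that almost match.
            if style != known and normalize(style) == normalize(known):
                warnings.append(
                    f"Style '{style}' should be named '{known}' (text: '{text[:30]}')."
                )
    return warnings
```

One way to obtain the pairs is a library such as python-docx, where each paragraph exposes `paragraph.style.name` and `paragraph.text`. That step is left out here so the sketch stays free of third-party dependencies.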

Paligo uses the hierarchy of heading styles to determine which parts of your document should be converted into top-level topics, second-level topics, third-level topics, and so on. For this to work correctly, it is important that you use the heading levels as intended and consistently in your MS Word doc.

  • Use Heading 1 for chapter headings

  • Use Heading 2 for first-level subsections in chapters

  • Use Heading 3 for second-level subsections in chapters

  • Use Heading 4 for third-level subsections in chapters

Always use them in this order and do not skip a heading level. For example, do not follow a Heading 1 with a Heading 3. The sequence of headings should always be Heading 1, Heading 2, Heading 3, and so on.
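
The rule about not skipping heading levels can be expressed as a small check. This is an illustrative sketch only: the function name `find_skipped_levels` is hypothetical, the input is assumed to be the paragraph style names in document order, and treating a document whose first heading is not Heading 1 as a skipped level is an assumption.

```python
import re
from typing import List


def find_skipped_levels(style_names: List[str]) -> List[str]:
    """Report headings that are more than one level deeper than the previous heading."""
    problems = []
    previous = 0  # 0 means no heading has been seen yet
    for position, name in enumerate(style_names, start=1):
        match = re.fullmatch(r"Heading ([1-6])", name)
        if match is None:
            continue  # not a heading style, so it does not affect the hierarchy
        level = int(match.group(1))
        # Going deeper is only allowed one level at a time; going back up is always fine.
        if level > previous + 1:
            problems.append(f"Paragraph {position}: {name} skips Heading {previous + 1}.")
        previous = level
    return problems
```

For example, `find_skipped_levels(["Heading 1", "Heading 3"])` reports the second paragraph, while a sequence that moves from Heading 3 back to Heading 2 or Heading 1 is accepted.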

Typically, technical communication is organized into three or four levels of content for ease of use. If you have more levels, you can use Heading 5 and Heading 6. However, a need for Heading 5 and Heading 6 could suggest a problem with the information architecture. It may be worth reconsidering the structure and reorganizing it into a maximum of three or four levels.

Your MS Word document most likely contains some features that are needed for Word, but they are either not needed in Paligo or they cannot be imported. These include page breaks, a table of contents, and review comments.

  • Paligo cannot import text that is inside frames. If you have text in frames, you will need to move it into the regular content or it will be lost.

  • If there are any active review comments or tracked changes, make sure they are resolved or accepted before you import.

  • Remove the header and footer content.

  • Remove all page breaks and section breaks.

  • Remove the Word table of contents (TOC). Paligo will generate its own table of contents when you publish in Paligo.

Note

In some cases, graphics on the front cover can be a problem with importing. If you have front cover graphics, it can be a good idea to remove them and then add them to Paligo manually after the import.
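
Some of the leftovers described above can be located by inspecting the Word file directly. A .docx file is a ZIP archive whose main content is stored in `word/document.xml`. The sketch below counts a few WordprocessingML markers in that file. It is a rough aid under stated assumptions, not part of Paligo: the function names are hypothetical, and headers, footers, and the table of contents are not covered.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by the elements in word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(tag: str) -> str:
    """Return the namespace-qualified name that ElementTree expects."""
    return f"{{{W}}}{tag}"


def scan_document_xml(document_xml) -> dict:
    """Count markers for content that should be cleaned up before import."""
    root = ET.fromstring(document_xml)

    def count(tag: str) -> int:
        return sum(1 for _ in root.iter(q(tag)))

    return {
        # <w:br w:type="page"/> is a manual page break.
        "page_breaks": sum(1 for br in root.iter(q("br")) if br.get(q("type")) == "page"),
        # A section break is a <w:sectPr> stored inside paragraph properties.
        "section_breaks": sum(
            1 for ppr in root.iter(q("pPr")) if ppr.find(q("sectPr")) is not None
        ),
        "tracked_insertions": count("ins"),
        "tracked_deletions": count("del"),
        "comment_markers": count("commentReference"),
        # <w:framePr> marks a paragraph that is positioned in a text frame.
        "text_frames": count("framePr"),
    }


def scan_docx(path: str) -> dict:
    """Read word/document.xml from a .docx file and scan it."""
    with zipfile.ZipFile(path) as archive:
        return scan_document_xml(archive.read("word/document.xml"))
```

A non-zero count is only a prompt to look at the document in Word. The sketch does not decide whether a particular item will cause a problem during import.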

In Microsoft Word, there are different ways to add content that is indented inside a list item (or is not indented, but really should be!). This can make it difficult to import in a consistent way.

For the cleanest import, we recommend that you format your indented list item content like this in MS Word:

  1. Create a new list item for the indented content.

  2. Add the text or image for the indented content in the new step.

  3. Position the cursor at the start of the new list item and press backspace. The list item formatting is removed so that the content is now indented as part of the previous step.

When you create indented content in MS Word in this way, Paligo will import the list and give it the expected structure:

correct-list-structure.png

Note that each list item has a listitem element and then the text for the list item is inside a para element. Here, we have imported a list that has an indented image for the third list item. The image uses a mediaobject element in Paligo and you can see that it is inside the listitem.

correct-list-structure-indented-image.png

For indented paragraphs, the paragraph is inserted in a para element inside the list item:

correct-list-structure-indented-para.png

You could have sublists indented in the same way. The important part is that the content is nested inside the listitem to which it relates.

Note

If you have content in a list and it was indented in a different way in MS Word, it may import as:

  • Regular paragraphs that break the flow of the list.

  • Literallayout elements, which are valid Paligo XML, but are not the recommended way to structure lists. In the editor, the list item can look like code, for example:

    literallayout-import.png

In both cases, you can use the Paligo editor to fix the issues. You can use the XML tree to indent the regular paragraphs and create new listitems for the literallayouts and move the content into those. Or you may prefer to address the problems in the MS Word document and re-import the content, so that the issues are fixed at the source.
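
If you have the imported content available as XML, the structure described above can be reviewed with a short script. This is a sketch under assumptions: it relies only on the element names mentioned in this section (listitem, para, literallayout), the function name `review_list_structure` is hypothetical, and how you obtain the XML is left open.

```python
import xml.etree.ElementTree as ET
from typing import List


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an element name."""
    return tag.rsplit("}", 1)[-1]


def review_list_structure(xml_text: str) -> List[str]:
    """List literallayout elements and list items that do not start with a para."""
    root = ET.fromstring(xml_text)
    findings = []
    for parent in root.iter():
        children = list(parent)
        for child in children:
            if local_name(child.tag) == "literallayout":
                findings.append(f"literallayout inside <{local_name(parent.tag)}>")
        if local_name(parent.tag) == "listitem":
            # The expected structure puts the list item text in a para first.
            if not children or local_name(children[0].tag) != "para":
                findings.append("listitem does not start with a para")
    return findings
```

An empty result means that none of these patterns were found. It does not detect regular paragraphs that break the flow of a list, so a visual check in the editor is still needed.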

Paligo can import tables from Microsoft Word, but for best results we recommend that you:

  • Avoid using large tables that cover multiple pages in MS Word.

    While large tables can import correctly, they may go beyond the boundaries of the Paligo editor, making them more difficult to edit (you will need to use the source code editor). It is better to break large tables down into sets of smaller tables. This makes them more usable in both Paligo and MS Word.

    When redesigning your tables, think about what information your users need from each table. Do they need to compare items? If yes, then can those related items be organized into smaller groups, perhaps smaller tables organized by product type or product name? If no, then think about making smaller tables for groups of related items rather than one oversized table.

  • Try to use simple tables where possible.

    Tables with merged cells will sometimes import as two separate columns, depending on how they are formatted in the background code in MS Word. The more complex the table, the greater the chance that you will need to manually edit the table in Paligo to get the result you want.

  • Use tables appropriately.

    Tables are a good option for presenting data, but are sometimes used for information that would work better as a regular paragraph. For example, if you have a cell with many paragraphs in it, then this could be a good candidate for being a section in its own right. The table cell would then just need a link to the appropriate section.

    If you intend to publish to the web, it is also worth noting that tables can be more difficult to use on small devices such as smartphones. This is especially true of larger tables where vertical and horizontal scrolling is needed.
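
To find tables that may need attention before import, you can summarize the tables in `word/document.xml` (the main part of a .docx archive). The sketch below is illustrative only: the function name `summarize_tables` is hypothetical, the choice of what to measure is an assumption, and cells of nested tables are counted as part of the outer table as well.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# WordprocessingML namespace used by the elements in word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(tag: str) -> str:
    """Return the namespace-qualified name that ElementTree expects."""
    return f"{{{W}}}{tag}"


def is_merged(cell: ET.Element) -> bool:
    """A cell is merged if it spans columns (gridSpan) or rows (vMerge)."""
    properties = cell.find(q("tcPr"))
    if properties is None:
        return False
    return (
        properties.find(q("gridSpan")) is not None
        or properties.find(q("vMerge")) is not None
    )


def summarize_tables(document_xml) -> List[Dict[str, int]]:
    """Return one summary per table, in document order."""
    root = ET.fromstring(document_xml)
    summaries = []
    for table in root.iter(q("tbl")):
        cells = list(table.iter(q("tc")))
        summaries.append({
            "rows": sum(1 for _ in table.iter(q("tr"))),
            "cells": len(cells),
            "merged_cells": sum(1 for cell in cells if is_merged(cell)),
            # Many paragraphs in one cell may suggest a section rather than a table.
            "max_paragraphs_in_cell": max(
                (len(cell.findall(q("p"))) for cell in cells), default=0
            ),
        })
    return summaries
```

Tables with a high row count, merged cells, or many paragraphs in one cell are candidates for the simplifications recommended above.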