Importer 3

Importer: XSL Transformation Before Importing

An XSL transformation can be carried out before the Sophora Importer starts the import process.

If an XML file is copied to the watchfolder of an Importer instance or handed to the web service, the Importer first checks whether the XML is valid Sophora XML by validating it against the Sophora XML Schema.

If the XML file contains valid Sophora XML, the actual import process begins. If, on the other hand, the file does not contain valid Sophora XML, or if the property sophora.importer.transformationMode has the value forceTransform, the Importer tries to generate valid Sophora XML by applying an XSL transformation. This makes it possible to pass any XML to the Importer, as long as the XSL transformation produces valid Sophora XML.

If sophora.importer.transformationMode does not have the value skipTransform, the property sophora.importer.directory.xsl must be set and must point to an existing folder in the file system. This folder contains the XSLT files that are used for the XSL transformation. The entry point is the XSLT file transform.xsl, which must exist in the XSL folder.

If the file transform.xsl or one of the XSL template files included via <xsl:include> is changed, the templates are reloaded automatically. It is therefore not necessary to restart the Importer.
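The rules above can be summarised in a small Java sketch. It is only an illustration of the documented behaviour, not the Importer's actual code: the class and method names are hypothetical, and the handling of skipTransform (never transform) is an assumption inferred from the fact that no XSL folder is required in that mode.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of the Sophora Importer API.
final class TransformDecisionSketch {

    /**
     * Decides whether the XSL transformation should run for a file, given the
     * result of the schema validation and the value of
     * sophora.importer.transformationMode.
     */
    static boolean shouldTransform(boolean isValidSophoraXml, String transformationMode) {
        if ("skipTransform".equals(transformationMode)) {
            // Assumption: in this mode the XML is never transformed.
            return false;
        }
        if ("forceTransform".equals(transformationMode)) {
            // Documented: the transformation runs even for valid Sophora XML.
            return true;
        }
        // Any other mode: transform only XML that is not valid Sophora XML.
        return !isValidSophoraXml;
    }

    /** Checks that the XSL folder exists and contains the entry point transform.xsl. */
    static boolean hasEntryPoint(String xslDirectory) {
        if (xslDirectory == null) {
            return false;
        }
        Path dir = Paths.get(xslDirectory);
        return Files.isDirectory(dir) && Files.isRegularFile(dir.resolve("transform.xsl"));
    }
}
```

A check like hasEntryPoint can be handy in a deployment script to catch a missing transform.xsl before the first file arrives in the watchfolder.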

Predefined XSL Template Parameters

A few predefined XSL template parameters are set during the transformation process and can therefore be used in the XSL templates.

| Variable | Description |
| --- | --- |
| currentTime | This variable holds the current time, i.e. the time when the template was called by the transformer. The format of the time string is ISO 8601, for instance 2010-08-05T09:00:00+02:00. |
| firstExecutionTime | This variable holds the time of the first import attempt. In most cases, this is equal to currentTime. However, if a document is locked and the import is set to ask for the lock (e.g., <forceLock timeout="10">, "polite import"), the import will be deferred. In such a case, this variable stays the same on subsequent import attempts. The format of the time string is ISO 8601. |
| importWatchfolder | This variable holds the watchfolder directory of the Importer instance, e.g. file:/data/sophora/cms/importer/common/incoming. |
| xmlFileName | This variable holds the name of the XML file which is transformed. |
| xmlFileFolder | This variable holds the directory path of the XML file which is transformed. This variable is useful if you want to generate additional files during the transformation process. For an example, see the select value example in the next section. |

To access these variables, you only have to declare the parameters at the top of the XSL template, as shown in the following example code.

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="1.0">
 
    <!-- Declaring the predefined parameters. -->
    <xsl:param name="currentTime" />
    <xsl:param name="importWatchfolder" />
    <xsl:param name="xmlFileName" />
    <xsl:param name="xmlFileFolder" />
 
    [...]
 
    <xsl:template match="meta">
        <property name="sophora-content:importTime">
            <!-- Accessing the value of the current time variable.  -->
            <value><xsl:value-of select="$currentTime" /></value>
        </property>
    </xsl:template>
 
</xsl:stylesheet>
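For background, the following Java sketch shows the standard JAXP mechanism by which a caller hands a value to a declared <xsl:param>. It is not the Importer's own code. The class name, the inline stylesheet and the sample input are made up for illustration; only the parameter name currentTime is taken from the table above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; not part of the Sophora Importer.
final class ParamPassingSketch {

    // Minimal XSLT 1.0 stylesheet that declares and outputs one parameter.
    static final String XSL =
          "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        + "<xsl:output omit-xml-declaration='yes'/>"
        + "<xsl:param name='currentTime'/>"
        + "<xsl:template match='/'>"
        + "<value><xsl:value-of select='$currentTime'/></value>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms the given XML and passes currentTime as a stylesheet parameter. */
    static String transform(String xml, String currentTime) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            // The name must match the name attribute of the <xsl:param> declaration.
            transformer.setParameter("currentTime", currentTime);
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<meta/>", "2010-08-05T09:00:00+02:00"));
    }
}
```

A parameter that is set by the caller but not declared in the stylesheet is simply not visible to the templates, which is why the declaration at the top of the XSL template is required.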

Using a Custom XSL Transformer

The Sophora property sophora.importer.xslTransformerFactory lets you define which transformer factory is used for performing the XSL transformations. The built-in default transformer factory is org.apache.xalan.xsltc.trax.TransformerFactoryImpl. By setting another transformer factory it is possible, for example, to run XSLT 2.0 transformations. (The built-in Xalan transformer only supports XSLT 1.0 transformations.)

However, it is not sufficient to alter only the value of the sophora.importer.xslTransformerFactory property. You also have to put the JAR file that contains the implementation of the declared transformer factory class into the lib folder of the Importer.

For instance, if you want to use the Saxon XSL transformer, you have to set the property sophora.importer.xslTransformerFactory to the value net.sf.saxon.TransformerFactoryImpl. Furthermore, you have to download the appropriate JAR file from the Saxon project website and put it into the Importer's lib directory.
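The reason the JAR is needed becomes clear when looking at how a transformer factory is typically instantiated by class name through JAXP. The following sketch is an assumption about the general mechanism, not the Importer's actual loading code, and the class FactoryLookupSketch is hypothetical. It can be used to check whether a factory class is loadable from the current classpath.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

// Hypothetical helper; not part of the Sophora Importer.
final class FactoryLookupSketch {

    /** Returns true if the given transformer factory class can be loaded and instantiated. */
    static boolean isAvailable(String factoryClassName) {
        try {
            // Standard JAXP call (Java 6+): instantiate a specific factory by class name.
            TransformerFactory.newInstance(factoryClassName,
                    FactoryLookupSketch.class.getClassLoader());
            return true;
        } catch (TransformerFactoryConfigurationError e) {
            // Thrown when the class is not on the classpath or cannot be instantiated.
            return false;
        }
    }

    public static void main(String[] args) {
        String configured = "net.sf.saxon.TransformerFactoryImpl";
        System.out.println(configured + " available: " + isAvailable(configured));
    }
}
```

Run with the Saxon JAR on the classpath, the check should succeed; without it, the lookup fails, which corresponds to a property value that points to a class missing from the lib folder.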

The following example shows how to create a Sophora import XML file and an additional select value XML file during a transformation by using an XSLT 2.0 template.

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
 
    <!-- Predefined import file folder parameter (see previous section). -->
    <xsl:param name="xmlFileFolder" />
 
    <xsl:output indent="yes" />
 
    <xsl:template match="/">
 
        <!-- This XSLT 2.0 element makes it possible to create an additional XML file. In this case the XML select value file is created and put in the directory
             in which the import XML file is located. It is important to use a file suffix other than "xml"; otherwise the file would be interpreted as an XML import file
             and would be transformed. -->
        <xsl:result-document href="{$xmlFileFolder}selectvalue_test_xml.txt">
            <selectValues xmlns="http://www.sophoracms.com/config/selectValues/1.2"
                          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                          xsi:schemaLocation="http://www.sophoracms.com/config/selectValues/1.2 http://www.sophoracms.com/config/selectValues/selectValues-1.2.xsd">
                <defaults>
                    <xsl:apply-templates select="values/value" />
                </defaults>
            </selectValues>
        </xsl:result-document>
 
        <document xmlns="http://www.sophoracms.com/import/2.8" nodeType="sophora-nt:selectValues" externalID="selectvalue_test">
            <properties>
                <property name="sophora:name">
                    <value>Selectvalue-Test</value>
                </property>
                <property name="sophora:binarydata" mimetype="text/xml">
                    <!-- The "reference" to the selectvalue file generated above. -->
                    <value>selectvalue_test_xml.txt</value>
                </property>
            </properties>
            <childNodes />
            <resourceList />
            <fields>
                <site>system</site>
                <structureNode>/</structureNode>
                <categories />
                <idstem>selectvaluetest</idstem>
                <forceLock>true</forceLock>
                <forceCreate>false</forceCreate>
                <enabledChannels />
                <disabledChannels />
            </fields>
            <instructions>
                <imageActivities />
                <lifecycleActivities>
                    <lifecycleActivity type="publish" />
                </lifecycleActivities>
                <proposals />
            </instructions>
        </document>
    </xsl:template>
 
    <xsl:template match="value">
        <selectValue xmlns="http://www.sophoracms.com/config/selectValues/1.2" value="{@key}" label="{@value}" />
    </xsl:template>
 
</xsl:stylesheet>

Calling Java Operations from Within XSL Templates

In two situations it is very helpful to call Java operations from within XSL templates:

  1. You want to execute algorithms which are difficult to implement in XSLT.
  2. You want to obtain information from the repository because this information affects the import process. You have to call Java operations from your XSL template for this, because you can't access the Sophora client directly from your XSL code.

The following example explains how Java operations are called from within XSL templates and how the Sophora client is accessed. The example comes from a migration context: while documents are being migrated to a Sophora server, editors may modify documents that were created by the Importer. The customer requires that the Importer must not touch such editor-modified documents when it imports them again. In such a case, the import must skip the XML.

First of all, you need Java code that provides an operation which checks whether a document may be imported by the Importer. The relevant operation is called isCreationOrUpdateAllowed. You can get the Sophora client directly from the Importer's SophoraClientWrapper via SophoraClientWrapper.getSophoraClient().

XslJavaCommunication.java

package com.subshell.sophora.examples.importer.xsl;
 
import com.subshell.sophora.api.IReference;
 
[...]
 
public class XslJavaCommunication  {
 
  /**
   * Returns 'true' if...
   * - the given external id is null or empty OR
   * - no document with the given external id exists OR
   * - a document with the given external id exists and the document's last modifier is the migration importer itself.
   */
  public static boolean isCreationOrUpdateAllowed(String externalId) {
    // An empty given external id is interpreted as creating
    // a new document.
    if (StringUtils.isBlank(externalId)) {
      return true;
    }
 
    try {
      // Getting the sophora client:
      ISophoraClient sophoraClient = SophoraClientWrapper.getSophoraClient();
      INode document = sophoraClient.getDocumentByExternalId(externalId);
 
      // The import is allowed only if the migration importer (its username is "migrationsimporter")
      // is the last modifier. Otherwise the document was modified by an editor (or any other user).
      return "migrationsimporter".equals(document.getProperty(SophoraConstants.SOPHORA_MODIFIED_BY).getString());
 
    } catch (ItemNotFoundException e) {
      // Document does not exist and therefore will be created:
      return true;
    } catch (Exception e) {
      log.error("Problem when executing java code from within XSL template.", e);
      return false;
    }
  }
 
}
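To try out the decision rule without a running Sophora server, the logic can be separated from the repository access. The following sketch is a hypothetical variant, not part of the original example: the class ImportRuleSketch and the lookup function lastModifierOf are made up, and an empty Optional stands for "no document with this external id exists". The error handling of the original operation (returning false on unexpected exceptions) is not modelled here.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical, repository-independent variant of the rule shown above.
final class ImportRuleSketch {

    static final String IMPORTER_USER = "migrationsimporter";

    /**
     * @param externalId     external id from the import XML
     * @param lastModifierOf looks up the last modifier of the document with the
     *                       given external id; empty if no such document exists
     */
    static boolean isCreationOrUpdateAllowed(String externalId,
            Function<String, Optional<String>> lastModifierOf) {
        // A missing external id is interpreted as creating a new document.
        if (externalId == null || externalId.trim().isEmpty()) {
            return true;
        }
        // Allowed if the document does not exist yet or was last modified by the importer.
        return lastModifierOf.apply(externalId)
                .map(IMPORTER_USER::equals)
                .orElse(true);
    }
}
```

In a unit test, the lookup can be a simple lambda such as id -> Optional.of("editor"); in the Importer it would be backed by the Sophora client call shown in the example above.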

But how is the Java operation called in your XSL template? That depends on the XSL transformer which is used by the Importer (see the previous section, Using a Custom XSL Transformer).

If the Xalan transformer is used (the default transformer), Java operations can be called as shown in the next example:

Calling Java operations from within XSL (Xalan)

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:sophora="http://www.sophoracms.com/import/2.8" xmlns="http://www.sophoracms.com/import/2.8" version="1.0">
 
  <xsl:output indent="yes" />
 
  <!-- Copy all. -->
  <xsl:template match="@* | node()" >
    <xsl:copy>
      <xsl:apply-templates select=" @* | node() " />
    </xsl:copy>
  </xsl:template>
 
  <!-- Handling document elements: They are only copied if they have not been modified by an editor. -->
  <xsl:template match="sophora:document">
    <xsl:variable name="isCreationOrUpdateAllowed">
      <!-- Calling the java operation "isCreationOrUpdateAllowed" from the above java class (Xalan). -->
      <xsl:value-of select="javaHelper:isCreationOrUpdateAllowed(@externalID)" xmlns:javaHelper="xalan://com.subshell.sophora.examples.importer.xsl.XslJavaCommunication" />
    </xsl:variable>
 
    <!-- If the above call returns 'true', the <document> element and its content are copied. If it returns 'false', the
         <document> element is ignored. -->
    <xsl:choose>
      <xsl:when test="$isCreationOrUpdateAllowed = 'true'">
        <xsl:copy>
          <xsl:apply-templates select=" @* | node() " />
        </xsl:copy>
      </xsl:when>
      <xsl:otherwise>
        <xsl:comment>Document '<xsl:value-of select="@externalID" />' was not imported because it was modified by an editor.</xsl:comment>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
 
</xsl:stylesheet>

If you use the Saxon transformer, the Java operation call is slightly different:

Calling Java operations from within XSL (Saxon)

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:sophora="http://www.sophoracms.com/import/2.8" xmlns="http://www.sophoracms.com/import/2.8" version="2.0">
  [...]
    <xsl:variable name="isCreationOrUpdateAllowed">
      <!-- Calling the java operation "isCreationOrUpdateAllowed" from the above java class (Saxon). -->
      <xsl:value-of select="javaHelper:isCreationOrUpdateAllowed(@externalID)" xmlns:javaHelper="java:com.subshell.sophora.examples.importer.xsl.XslJavaCommunication" />
    </xsl:variable>
  [...]
</xsl:stylesheet>
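The only difference between the two variants is the namespace URI that binds the prefix javaHelper to the Java class. The following sketch captures that rule for stylesheets that are generated or checked programmatically. The class ExtensionNamespaceSketch is hypothetical, and treating every non-Saxon factory as Xalan is a simplifying assumption.

```java
// Hypothetical helper; not part of the Sophora Importer.
final class ExtensionNamespaceSketch {

    /**
     * Returns the namespace URI under which the static methods of helperClass
     * can be called from XSL, depending on the configured transformer factory.
     */
    static String namespaceFor(String transformerFactoryClass, String helperClass) {
        if (transformerFactoryClass != null && transformerFactoryClass.startsWith("net.sf.saxon.")) {
            // Saxon: "java:" followed by the fully qualified class name.
            return "java:" + helperClass;
        }
        // Assumption: any other factory is the default Xalan transformer.
        return "xalan://" + helperClass;
    }

    public static void main(String[] args) {
        String helper = "com.subshell.sophora.examples.importer.xsl.XslJavaCommunication";
        System.out.println(namespaceFor("net.sf.saxon.TransformerFactoryImpl", helper));
        System.out.println(namespaceFor("org.apache.xalan.xsltc.trax.TransformerFactoryImpl", helper));
    }
}
```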

Built-in Java Functions

A few Java functions are automatically available in XSLT templates.

| Function | Description |
| --- | --- |
| getDocumentAsSophoraXml(String externalId, String xmlVersion) | Returns the Sophora XML of the document with the given external id. (The Sophora XML is presented in the given Sophora XML version.) |
| getDocumentAsSophoraXml(String externalId, String xmlVersion, int maxLevel) | Returns the Sophora XML of the document with the given external id and of recursively referenced documents (controllable with the parameter maxLevel). (The Sophora XML is presented in the given Sophora XML version.) |
| logEMailWarnMessage(String warnText) | Writes the given text as a warn message to the log file and additionally sets the email marker EMAIL_NOTIFICATION, with the result that a warn email is sent (if configured in the logback configuration file; for more information and an example logback file see the chapter "Importer: Installation & Configuration"). This operation is useful if you want to inform a user or a group of users (e.g. the deliverer of the XML) while the import process should nevertheless continue. |
| logEMailErrorMessage(String errorText) | Writes the given text as an error message to the log file and additionally sets the email marker EMAIL_NOTIFICATION, with the result that an error email is sent (if configured in the logback configuration file; for more information and an example logback file see the chapter "Importer: Installation & Configuration"). This operation is useful if you want to inform a user or a group of users (e.g. the deliverer of the XML) while the import process should nevertheless continue. |

The following code snippet shows a fragment of an XSLT transformation (using the Saxon transformer) that uses the function getDocumentAsSophoraXml to check whether an existing document has certain features:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns="http://www.sophoracms.com/import/3.2"
                xmlns:soph="http://www.sophoracms.com/import/3.2"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                version="2.0">

    <xsl:template match="liveticker" >
            
        <!-- ... -->
        
        <resourceList>
            <xsl:variable name="referencingMatchRound"
                          select="importUtils:getDocumentAsSophoraXml($externalIdMatchround, '3.2')"
                          xmlns:importUtils="java:com.subshell.sophora.importer.ImportUtils" />
            
            <xsl:choose>
                <!-- No document found. -->
                <xsl:when test="not($referencingMatchRound)"> 
                    <!-- ... -->
                </xsl:when>
                <xsl:otherwise> 
                    <xsl:variable
                        name="matchInReferencingMatchRound"
                        select="$referencingMatchRound//soph:documents/soph:document/soph:childNodes/soph:childNode
                            [soph:properties/soph:property[@name = 'sophora-sport:matchId']/soph:value = $matchId]" />
                    
                    <!-- ... -->
                      
                </xsl:otherwise>
            </xsl:choose>
        </resourceList>   
    </xsl:template>
    
</xsl:stylesheet>

Last modified on 7/26/19

The content of this page is licensed under the CC BY 4.0 License. Code samples are licensed under the MIT License.