Configuration Files
The Importer uses the following configuration files:
- application.yml: This is the main configuration file.
- sophora-importer-<version>.conf (optional): Used to set the JVM properties (JAVA_OPTS) such as heap size.
- loader.properties (optional): Used for adding the contents of a folder to the classpath of the Importer.
- logback-spring.xml: Logging configuration.
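The Importer needs to be restarted for changes in any of these configuration files to take effect. The only exception is logback-spring.xml, which can be reloaded at runtime if scanning is enabled (see section logback-spring.xml).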
Installation
The Importer is a Spring Boot app and expects to find the configuration files next to the jar file, so we recommend putting the Importer application jar and the configuration files into the same directory. Apart from the file extension, the name of the .conf file must exactly match the name of the jar file; i.e., if the name of the jar file includes the version number, the name of the .conf file must include it as well.
Recommended directory structure:
/cms
  /sophora-importer
    /additionalLibs
    /groovy
    /logs
    application.yml
    sophora-importer-4.0.0.conf
    sophora-importer-4.0.0.jar
    loader.properties
    logback-spring.xml
Our Maven repository contains two files suitable for deploying the Importer:
com.subshell.sophora.importer-<VERSION>-executable.jar
com.subshell.sophora.importer-<VERSION>-bin.tar.gz
The executable jar is basically the Importer application without any configuration files. It is suitable when deploying the Importer using Ansible, Puppet or similar configuration management tools. The bin.tar.gz contains the executable jar as well as sample configuration files. Use this for manual deployments to get started quickly.
The "xsl" folder in the bin.tar.gz contains example XSL files for transforming XML files into Sophora XML (see section XSL Transformation Before Importing).
Starting and Stopping
The Importer jar file is a Spring Boot executable jar. On Linux and macOS, the Importer can be started by running the executable jar as follows:
./sophora-importer-<version>.jar run
To run the Importer as a background daemon, use the options start and stop.
More options are documented in the Spring Boot documentation.
If a configuration file lacks required configuration options, the Importer does not start. The log file then states why the startup was aborted.
Adding jar files to the classpath
Jar files with additional classes for preprocessing etc. can be added to the classpath with the following entry in the file loader.properties:
# Loads resources (.class files etc.) from nested jar files in directories.
# Should contain comma-separated list of directories, archives, or directories within archives
# (e.g. lib,${HOME}/app/lib, earlier entries take precedence).
loader.path=additionalLibs
JAVA_OPTS
Options for the Java VM, such as heap size, can be set using the environment variable JAVA_OPTS or using an entry in the sophora-importer.conf file. For example:
JAVA_OPTS="-Xmx1G"
Management-Endpoints / Actuators
The Importer exposes a few HTTP endpoints for management and metrics. These are available on the same HTTP port as the SOAP web service, and access to them is governed by the same authentication settings as the web service. Notable endpoints are:
- /actuator/health
- /actuator/jolokia
- /actuator/prometheus
- /actuator/sophora-server
Configuration using the application.yml
One or more importer instances run within the importer process. Some configuration options are specific to each instance; for example, each instance has its own watch folder. This lets you run a single importer process with, for example, one instance responsible for video imports, one for image imports from an image database and another for live ticker imports.
Options for the connection to the Sophora Server
Property | Description | Default |
---|---|---|
sophora.client.server-connection.url | Deprecated: As of importer 4.1.2, use sophora.client.server-connection.urls instead. The single address (RMI or HTTP) to connect with (e.g. http://demo.de:1196). Note: There is one connection which is shared among all importer instances of the importer process. | |
sophora.client.server-connection.urls | A list of addresses (RMI or HTTP) to connect with, given as a YAML list under urls (see the example application.yml). Note: There is one connection which is shared among all importer instances of the importer process. | |
sophora.client.server-connection.username | Username to access Sophora's content manager. | |
sophora.client.server-connection.password | Password to access Sophora's content manager. | |
sophora.client.cache.document-cache-elements-in-memory | The size of the document cache. If you apply a transformation or a preprocessor that frequently accesses different existing documents from the Sophora server, you may want to increase this cache. Consider the increased memory footprint and assign more memory to the importer if necessary. | 1000 |
sophora.client.cache.published-document-cache-elements-in-memory | The size of the published document cache. Similar to sophora.client.cache.document-cache-elements-in-memory, except that this value only considers the published versions of documents. If you retrieve the published version of documents in a transformation or a preprocessor, this is the value you may want to adjust. | 100 |
sophora.client.misc.data-dir | Defines a directory which may be used by the Sophora Client API for persisting information such as the available nodes in a cluster. The directory must be specified as an absolute path. | Working directory of the importer |
sophora.client.proxy.host | URL of the proxy host, e.g. http://www.proxy.org. | |
sophora.client.proxy.password | Password for the proxy. | |
sophora.client.proxy.port | Port of the proxy host, e.g. 8181. | |
sophora.client.proxy.username | Username for the proxy. | |
sophora.client.server-connection.retries | Number of times to retry if a connection to the Sophora server cannot be established. | 3 |
sophora.client.server-connection.retry-interval | The time in seconds to wait between connection attempts. | 10 |
sophora.client.server-connection.use-migration-mode | Should only be used in rare circumstances. Use this only if you know what you are doing! Enables the migration mode when accessing the repository. When migration mode is switched on, the system properties sophora:modificationDate, sophora:modifiedBy, sophora:publicationDate, sophora:firstPublicationDate and sophora:publishedBy, which normally cannot be set, can be controlled when a new version of the document is created (e.g. when the document is published via a "publish" instruction). | false |
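As an illustration of how the connection options above fit together in the application.yml, here is a minimal sketch. The property names are taken from the table; the host names, credentials and cache sizes are placeholder values chosen for this example, not recommendations.

```yaml
sophora:
  client:
    server-connection:
      # Hypothetical server addresses; replace with your own.
      urls:
        - http://sophora-a.example.com:1196
        - http://sophora-b.example.com:1196
      username: importer        # placeholder
      password: secret          # placeholder
      retries: 3                # documented default
      retry-interval: 10        # seconds, documented default
    cache:
      # Raised above the defaults (1000 / 100) for a preprocessor that
      # reads many existing documents; the values are illustrative only.
      document-cache-elements-in-memory: 5000
      published-document-cache-elements-in-memory: 500
    proxy:
      host: http://www.proxy.org
      port: 8181
```

If the caches are enlarged like this, the heap size set via JAVA_OPTS may need to be increased as well.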
Options for the embedded HTTP server
Property | Description | Default |
---|---|---|
server.port | HTTP port of the web server for the SOAP web service and management endpoints (e.g. health). | 8081 |
server.address | Interface address to bind to. | 0.0.0.0 |
Global options
Some of these options can be overridden for each instance.
Property | Description | Default |
---|---|---|
importer.cleanupFoldersCron | A cron expression that specifies when the "successful" and "failure" folders will be cleaned up. The expression uses the format of the Quartz CronTrigger. See also Automatically deleting old files. Leave empty to disable cleanup. | |
importer.cleanupFoldersFailureMaxAge | When cleaning up the failure folder of an instance, files in the folder must be at least this many days old to be deleted. Set to 0 to disable deletion for this folder. | 0 |
importer.cleanupFoldersSuccessfulMaxAge | When cleaning up the success folder of an instance, files in the folder must be at least this many days old to be deleted. Set to 0 to disable deletion for this folder. | 0 |
importer.disabled | This is for test purposes only (e.g. if you want to check the XSL transformation). If set to true, import transformations will run but no documents will be created or modified in the Sophora server. | false |
importer.feedPollingEnabled | Set to true for polling configured feeds. | false |
importer.filenamesAddTimestamp | Determines whether a timestamp is attached to the names of the files that are imported and to the names of the temporary files. | true |
importer.folders.failure | Target directory to which the XML files are moved if the import failed. Patterns can be used within the path; for supported patterns see the success folder option. | |
importer.folders.feedPollingData | Directory for saving data regarding the polling of feeds (e.g. last processed feed item per feed). If feed polling is active and no directory is given, a folder named feedpolling is created in the working directory of the importer. | |
importer.folders.fileAccessBase | Optional directory that can additionally be accessed (recursively) during the import. On the one hand, the Sophora XML may reference binary files within this folder or its subfolders. On the other hand, the webservice may read files in this folder or its subfolders, which affects the URIs allowed in the importXmlByReference* methods. If the property is not configured, the webservice is not allowed to access any local files. | |
importer.folders.success | Target directory to which the XML files are moved if the import finished successfully. Patterns of the form ${pattern} can be used within the path. Supported patterns: ${date;<DateFormat>} - the date of the import in the given format (for supported date formats see https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html); ${<xslParameter>} - XSL parameter keys given in feed imports or imports by web service. | |
importer.folders.watchCheckInterval | Interval (in milliseconds) to check the import directory (watch folder). | 10000 |
importer.folders.watchFilesRegex | This regular expression determines which files in the watch folder are processed by the importer. The default value is a regular expression matching all file names which end with '.xml', but do not end with '.config.xml' or '.bin.xml'. If you change this regular expression, make sure that files ending with '.config.xml' and '.bin.xml' are still ignored, because these file endings are produced when Sophora documents including XML binary data or node type configurations are exported. Hint: The regular expression used for a watch folder instance is printed in the log file when the importer is started. | (?i).+(?<!\\.config)(?<!\\.bin)\\.xml |
importer.folders.watchRecursive | If set to true, all subfolders (and their subfolders etc.) of the watch folder are included when watching for incoming Sophora XML files. If this parameter is set to true, make sure that other folders such as the success and failure folders are not configured as subfolders of the watch folder. The importer imports documents in lexicographical order based on their relative paths; i.e. an incoming file subfolder-A/import.xml is handled before a file subfolder-B/import.xml. | false |
importer.folders.xsl | The directory where the XSL files are located for the importer (see section XSL Transformation Before Importing). This property may only be omitted if XSL transformations are disabled. | |
importer.httpProxyHost | A proxy configuration is needed if the importer operates behind a proxy and at least one of the following applies: the Import XML is passed to the webservice as a remote URL, the Import XML refers to binary files via http or https, or feeds are imported. Example: http://www.proxy.org. | |
importer.httpProxyLogin | Optional username for the proxy. | |
importer.httpProxyPassword | Password for the proxy. If a username is set, a password has to be set as well. | |
importer.httpProxyPort | Port of the proxy host, e.g. 8181. | |
importer.httpSoTimeout | Timeout in milliseconds when accessing feeds or making other http requests (for example downloading images). | 30000 |
importer.jmxLogin | Username for the JMX connection. | |
importer.jmxPassword | Password for the JMX connection. | |
importer.keepTempFiles | Determines whether to keep the temporary files after the Importer finishes. If the value is true, these files are moved to the success or failure directory together with the XML files. | |
importer.maximumImportsToKeep | Number of import results to keep in memory for JMX. | |
importer.minimumFailedImportsToKeep | The minimum number of failed import results to keep in memory for JMX. | |
importer.name | The Importer's name to be used for JMX, logging, and matching feed configurations. | |
importer.preprocessing.className | Defines the class which implements the IPreProcession interface. | |
importer.preprocessing.scriptFolder | Folder containing groovy preprocessing scripts. Can be left blank if the preprocessor class is on the classpath or preprocessing is not used. Using precompiled groovy class files from "additionalLibs" folder is recommended for best performance, as otherwise scripts are recompiled for each import. | |
importer.rmiRegistryPort | RMI port for external MBean requests. | 6001 |
importer.rmiServicePort | Internal RMI port for the JMX communication. | 6000 |
importer.springAdditionalBasePackages | Additional Java packages which the Importer should scan for Spring component classes. By using this property and putting client-specific jars on the classpath of the importer, you can use Spring functionality in your project-specific code. Multiple packages are separated by commas, e.g. "com.subshell.sophora.sport.imports,com.subshell.sophora.other.imports". | |
importer.transform | Defines the XSL transformation mode for this importer instance. Valid values: transformIfNotSophoraXml - an XSL transformation is performed if the input XML file does not contain valid Sophora XML; forceTransform - an XSL transformation is always applied before importing, regardless of the validity of the source XML file; skipTransform - an XSL transformation is never executed. | transformIfNotSophoraXml |
importer.validateDocuments | Defines whether documents are validated. Validation should only be disabled in very special situations and re-enabled afterwards, for example in a migration from Sophora to Sophora where documents lack a recently added mandatory property. When validation is disabled (value false), invalid documents can be saved: documents with missing mandatory properties and with property values that do not match the corresponding validation expression. Furthermore, results of validation scripts are ignored. | true |
importer.webService.authenticationRequired | Enables basic authentication for the SOAP webservice interface. Possible values are true and false. | false |
importer.webService.defaultInstance | Sets the default instance for the SOAP webservice. This property is used by all webservice methods which do not contain 'ToInstance' in their method name. | First instance |
importer.webService.enabled | Enables or disables the SOAP webservice interface. | false |
importer.webService.logins | Map of usernames to passwords for authentication. | |
importer.xslTransformerFactory | The class name of the XSL transformer factory which is used for XSL transformations (see section Using a Custom XSL Transformer). | org.apache.xalan.xsltc.trax.TransformerFactoryImpl |
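The cleanup options in the table above work together: the cron expression decides when cleanup runs, and the two max-age values decide which files are old enough to be deleted. The following sketch shows one possible combination. The schedule, the ages and the instance are assumptions made for this example; the instance's other required options are left out.

```yaml
importer:
  # Quartz cron fields: second minute hour day-of-month month day-of-week [year]
  # "0 0 3 ? * SUN" fires every Sunday at 03:00:00.
  cleanupFoldersCron: "0 0 3 ? * SUN"
  # Global defaults, in days. A value of 0 disables deletion for that folder.
  cleanupFoldersSuccessfulMaxAge: 30
  cleanupFoldersFailureMaxAge: 90
  instances:
    - name: Archive Imports            # hypothetical instance
      key: archive
      # Per-instance override: keep failed files of this instance indefinitely.
      cleanupFoldersFailureMaxAge: 0
      # folders etc. omitted for brevity
```

Leaving cleanupFoldersCron empty disables cleanup altogether, regardless of the max-age values.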
Options for each instance
Each importer instance has a set of configuration options. For some options, it is possible to set defaults in the global options, which can then be overridden for a particular instance.
Instances are configured like this in the application.yml:
importer:
  instances:
    - name: Common Imports
      key: common
      folders:
        watch: /cms/data/import/incoming
      ...
    - name: Image importer
      key: images
      ...
Property | Description | Default |
---|---|---|
cleanupFoldersFailureMaxAge | When cleaning up the failure folder of the instance, files in the folder must be at least this many days old to be deleted. Set to 0 to disable deletion for this instance / folder. | |
cleanupFoldersSuccessfulMaxAge | When cleaning up the success folder of the instance, files in the folder must be at least this many days old to be deleted. Set to 0 to disable deletion for this instance / folder. | |
defaultSite | The site to import the documents to. This parameter is only considered if both the <site> and the <structureNode> elements in the XML are empty and the import operation is not an update of an existing document. | |
defaultStructureNode | The structure node to import the documents to. This parameter is only considered, if the <structureNode> element in the XML is empty and the import operation is not an update of an existing document. | |
disabled | This is for test purposes only (e.g. if you want to check the XSL transformation). If set to true, import transformations will run but no documents will be created or modified in the Sophora server. | false |
filenamesAddTimestamp | Determines whether a timestamp is attached to the names of the files that are imported and to the names of the temporary files. | true |
folders.failure | Target directory to which the XML files are moved if the import failed. Patterns can be used within the path; for supported patterns see the success folder option. | |
folders.fileAccessBase | Optional directory that can additionally be accessed (recursively) during the import. On the one hand, the Sophora XML may reference binary files within this folder or its subfolders. On the other hand, the webservice may read files in this folder or its subfolders, which affects the URIs allowed in the importXmlByReference* methods. If the property is not configured, the webservice is not allowed to access any local files. | |
folders.success | Target directory to which the XML files are moved if the import finished successfully. Patterns of the form ${pattern} can be used within the path. Supported patterns: ${date;<DateFormat>} - the date of the import in the given format (for supported date formats see https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html); ${<xslParameter>} - XSL parameter keys given in feed imports or imports by web service. | |
folders.temp | Directory to save temporary files in, which the importer instance produces. | |
folders.watch | The import directory that the importer instance monitors. Only files matching the watchFilesRegex will be processed. | |
folders.watchCheckInterval | Interval (in milliseconds) to check the import directory (watch folder). | 10000 |
folders.watchFilesRegex | This regular expression determines which files in the watch folder are processed by the importer. The default value is a regular expression matching all file names which end with '.xml', but do not end with '.config.xml' or '.bin.xml'. If you change this regular expression, make sure that files ending with '.config.xml' and '.bin.xml' are still ignored, because these file endings are produced when Sophora documents including XML binary data or node type configurations are exported. Hint: The regular expression used for a watch folder instance is printed in the log file when the importer is started. | (?i).+(?<!\\.config)(?<!\\.bin)\\.xml |
folders.watchRecursive | If set to true, all subfolders (and their subfolders etc.) of the watch folder are included when watching for incoming Sophora XML files. If this parameter is set to true, make sure that other folders such as the success and failure folders are not configured as subfolders of the watch folder. The importer imports documents in lexicographical order based on their relative paths; i.e. an incoming file subfolder-A/import.xml is handled before a file subfolder-B/import.xml. | |
folders.xsl | The directory where the XSL files are located for the importer (see section XSL Transformation Before Importing). This property may only be omitted if XSL transformations are disabled. | |
key | The key or id of this instance. The key is used by SOAP import requests to select an importer instance. It is also used for mapping a feed import configuration to an instance. | |
maximumImportsToKeep | Number of import results to keep in memory for JMX. | |
minimumFailedImportsToKeep | The minimum number of failed import results to keep in memory for JMX. | |
name | The name of the particular importer instance to be used for JMX and logging purposes. | |
preprocessing.className | Defines the class which implements the IPreProcession interface. | |
preprocessing.scriptFolder | Folder containing groovy preprocessing scripts. Can be left blank if the preprocessor class is on the classpath or preprocessing is not used. Using precompiled groovy class files from "additionalLibs" folder is recommended for best performance, as otherwise scripts are recompiled for each import. | |
transform | Defines the XSL transformation mode for this importer instance. Valid values: transformIfNotSophoraXml - an XSL transformation is performed if the input XML file does not contain valid Sophora XML; forceTransform - an XSL transformation is always applied before importing, regardless of the validity of the source XML file; skipTransform - an XSL transformation is never executed. | transformIfNotSophoraXml |
validateDocuments | Defines whether documents are validated. Validation should only be disabled in very special situations and re-enabled afterwards, for example in a migration from Sophora to Sophora where documents lack a recently added mandatory property. When validation is disabled (value false), invalid documents can be saved: documents with missing mandatory properties and with property values that do not match the corresponding validation expression. Furthermore, results of validation scripts are ignored. | true |
webServiceEnabled | Enables or disables the SOAP webservice interface for this instance. | |
xslTransformerFactory | The class name of the XSL transformer factory which is used for XSL transformations (see section Using a Custom XSL Transformer). | org.apache.xalan.xsltc.trax.TransformerFactoryImpl |
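To illustrate how global defaults and per-instance overrides interact, the following sketch defines defaults under importer and overrides two of them for a single instance. The instance names, keys and paths are assumptions made for this example; only the property names come from the tables above.

```yaml
importer:
  # Global defaults, used by every instance that does not override them.
  transform: skipTransform
  folders:
    watchCheckInterval: 10000
    xsl: /cms/sophora-importer/xsl/
  instances:
    # Uses the global defaults unchanged.
    - name: Video Imports
      key: videos
      folders:
        watch: /cms/data/import/videos/incoming
        temp: /cms/data/import/videos/temp
        success: /cms/data/import/videos/success
        failure: /cms/data/import/videos/failure
    # Overrides the transformation mode and polls its watch folder more often.
    - name: Live Ticker Imports
      key: liveticker
      transform: forceTransform
      folders:
        watch: /cms/data/import/liveticker/incoming
        temp: /cms/data/import/liveticker/temp
        success: /cms/data/import/liveticker/success
        failure: /cms/data/import/liveticker/failure
        watchCheckInterval: 1000
```

Keeping shared settings at the global level means a new instance usually only needs its name, key and folders.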
Example application.yml:
sophora:
  client:
    server-connection:
      urls:
        - http://localhost:1196
      username: alice
      password: secret
      retries: 100
      retry-interval: 10
importer:
  name: Demo-Importer
  # JMX
  rmiServicePort: 5000
  rmiRegistryPort: 5001
  jmxLogin: importerjmx
  jmxPassword: password
  # Defaults for all instances.
  folders:
    watchCheckInterval: 1000
    watchRecursive: true
  webService:
    enabled: true
    authenticationRequired: true
    defaultInstance: common
    logins:
      admin: xxx
  filenamesAddTimestamp: false
  cleanupFoldersCron: "0 0 9 ? * * *"
  cleanupFoldersSuccessfulMaxAge: 90
  cleanupFoldersFailureMaxAge: 90
  # Configuration of the importer instances
  instances:
    - name: Common Imports
      key: common
      transform: skipTransform
      folders:
        watch: /cms/data/import/incoming
        temp: /cms/data/import/temp
        success: /cms/data/import/success
        failure: /cms/data/import/failure
        xsl: /cms/sophora-importer/xsl/
      defaultStructureNode: /import
server:
  port: 8081
  address: 0.0.0.0
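The disabled option described in the instance table can be combined with forceTransform to try out an XSL transformation without touching the Sophora server. A minimal sketch of such a test instance follows. The instance name, key and paths are hypothetical; the property names are taken from the tables above.

```yaml
importer:
  instances:
    - name: XSL Test                   # hypothetical test instance
      key: xsltest
      # Transformations run, but no documents are created or modified.
      disabled: true
      # Always apply the XSL transformation, even for valid Sophora XML.
      transform: forceTransform
      folders:
        watch: /cms/data/import/xsltest/incoming
        temp: /cms/data/import/xsltest/temp
        success: /cms/data/import/xsltest/success
        failure: /cms/data/import/xsltest/failure
        xsl: /cms/sophora-importer/xsl/
```

Because disabled is set on the instance rather than globally, other instances in the same importer process keep importing as usual.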
logback-spring.xml (optional)
This optional configuration file defines the Importer's logging behaviour. If the attribute scan="true" is set on the root element of the logging configuration file (optionally together with scanPeriod), the Importer does not need to be restarted for changes in this file to take effect.
Introductory information about logback and its configuration can be found here: http://logback.qos.ch/
If you would like to enable separate logging for each importer instance, you can use this example configuration file and uncomment the marked appender-ref entry for the "INSTANCES" appender. The importer will then create the following log files:
- sophora-importer.log: the default log file with all information. This can be disabled by removing the "LOGFILE" appender.
- sophora-importer-instance-main.log: the log file containing all information that cannot be assigned to one specific instance.
- sophora-importer-instance-<key>.log: one log file for each configured instance, containing only instance-specific information.
Example logback-spring.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<!-- For more information on logback logging see: http://logback.qos.ch/manual/index.html -->
<configuration scan="true" scanPeriod="10 seconds">
<jmxConfigurator/>
<!-- Name to be shown in the subject of email notifications. -->
<property name="IMPORTER_NAME" value="Test-Importer" />
<!-- Logging-Event-Class ("ERROR", "INFO" etc.) for email subjects. -->
<property name="LOGGING_EVENT_CLASS" value="%-5p" />
<!-- Logging pattern: 'importerInstanceName' and 'sourceFileName' are references to MDC properties in the importer code. -->
<property name="APPENDER_PATTERN"
value="%d{dd.MM.yyyy HH:mm:ss} %5level [%12.12thread] [%X{importerInstanceName}: %X{feedName} %X{sourceFileName}] %.40(%logger{0}:%L) --- %msg%n%ex"/>
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<filter class="ch.qos.logback.core.filter.EvaluatorFilter">
<evaluator>
<!-- No log messages marked as 'SPECIAL_EMAIL_NOTIFICATION' should be shown. -->
<expression>marker != null &amp;&amp; marker.getName().equals("SPECIAL_EMAIL_NOTIFICATION")</expression>
</evaluator>
<OnMismatch>NEUTRAL</OnMismatch>
<OnMatch>DENY</OnMatch>
</filter>
<encoder>
<pattern>${APPENDER_PATTERN}</pattern>
</encoder>
</appender>
<!-- Separate log files for each instance -->
<appender name="INSTANCES" class="ch.qos.logback.classic.sift.SiftingAppender">
<discriminator>
<key>importerInstanceKey</key>
<defaultValue>main</defaultValue>
</discriminator>
<sift>
<appender name="INSTANCE-${importerInstanceKey}" class="ch.qos.logback.core.FileAppender">
<filter class="ch.qos.logback.core.filter.EvaluatorFilter">
<evaluator>
<expression>marker != null &amp;&amp; marker.getName().equals("SPECIAL_EMAIL_NOTIFICATION")</expression>
</evaluator>
<OnMismatch>NEUTRAL</OnMismatch>
<OnMatch>DENY</OnMatch>
</filter>
<File>logs/sophora-importer-instance-${importerInstanceKey}.log</File>
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<fileNamePattern>logs/sophora-importer-instance-${importerInstanceKey}.%d{yyyy-MM-dd}.log</fileNamePattern>
<maxHistory>7</maxHistory>
</rollingPolicy>
<encoder>
<pattern>${APPENDER_PATTERN}</pattern>
</encoder>
</appender>
</sift>
</appender>
<appender name="LOGFILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<filter class="ch.qos.logback.core.filter.EvaluatorFilter">
<evaluator>
<!-- No log messages marked as 'SPECIAL_EMAIL_NOTIFICATION' should be shown. -->
<expression>marker != null &amp;&amp; marker.getName().equals("SPECIAL_EMAIL_NOTIFICATION")</expression>
</evaluator>
<OnMismatch>NEUTRAL</OnMismatch>
<OnMatch>DENY</OnMatch>
</filter>
<File>logs/sophora-importer.log</File>
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<fileNamePattern>logs/sophora-importer.%d{yyyy-MM-dd}.log</fileNamePattern>
<maxHistory>7</maxHistory>
</rollingPolicy>
<encoder>
<pattern>${APPENDER_PATTERN}</pattern>
</encoder>
</appender>
<!-- For more information on logback email logging see: http://logback.qos.ch/manual/appenders.html#SMTPAppender -->
<appender name="EMAIL" class="ch.qos.logback.classic.net.SMTPAppender">
<evaluator class="ch.qos.logback.classic.boolex.OnMarkerEvaluator">
<marker>EMAIL_NOTIFICATION</marker>
<marker>SPECIAL_EMAIL_NOTIFICATION</marker>
</evaluator>
<SMTPHost>smtp.host.de</SMTPHost>
<Username>xxx@yourmail.com</Username>
<Password>your_password</Password>
<To>importererror@yourcompany.com</To>
<From>importer@yourcompany.com</From>
<Subject>${LOGGING_EVENT_CLASS}: Importer '${IMPORTER_NAME}', instance '%X{importerInstanceName}'</Subject>
<layout class="ch.qos.logback.classic.PatternLayout">
<Pattern>${APPENDER_PATTERN}</Pattern>
</layout>
<CyclicBufferTracker class="ch.qos.logback.core.spi.CyclicBufferTracker">
<!-- Send just one log entry per email. -->
<BufferSize>1</BufferSize>
</CyclicBufferTracker>
<!-- Encoding of the email. -->
<CharsetEncoding>ISO-8859-1</CharsetEncoding>
</appender>
<logger name="com.subshell.sophora" level="INFO"/>
<logger name="org.springframework.boot" level="INFO"/>
<root level="WARN">
<appender-ref ref="LOGFILE" />
<!-- Remove comment if you want log to the console.
<appender-ref ref="STDOUT" />
-->
<!-- Remove comment if you want to have separate log files for each importer instance.
<appender-ref ref="INSTANCES" />
-->
<!-- Remove comment if you want to have email notifications on particular importer errors.
<appender-ref ref="EMAIL" />
-->
</root>
</configuration>
Binary property names for old versions of Sophora XML (optional)
Binary content within Sophora is modeled as a special property which needs to be treated separately in Sophora XML: First, properties must be declared as binary explicitly. Second, the names of these properties must be mapped to the names of the corresponding mimetype properties. This mapping is defined in the application.yml file as shown in the following example:
# Provides a default mapping between binary property name and mimetype name.
# This is needed by the importer to successfully import documents with Sophora-XML <= 1.6 if custom binary properties are used.
importer:
  binaryProperties:
    'sophora:binarydata': 'sophora:mimetype'
    'sophora-extension:binarydata': 'sophora:mimetype'
    'core:binarydata': 'core:mimetype'
Each element in the map assigns a binary property to a mimetype property.
Properties in the Sophora XML that match one of the keys in this map are interpreted as binary. There must be another property on the same level of the Sophora XML that matches the value of the binary properties map. For example:
<childNodes>
<childNode nodeType="sophora-extension-nt:imagedata" name="sophora-extension:imagedata">
<properties>
<property name="sophora-extension:binarydata">
<value>olympiapeking_crowd.jpg</value>
</property>
<property name="sophora:mimetype">
<value>image/jpeg</value>
</property>
</properties>
<childNodes/>
<resourceList/>
</childNode>
</childNodes>
If the binary properties option is not configured, a default configuration is applied. A binary property that is not covered by the configuration is imported nonetheless; its mimetype is then derived from the file extension. In that case, the Importer writes a warning to the log saying that a binary property may not have been imported correctly.
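To avoid that warning for a project-specific binary property, the mapping can be extended with an additional entry. The sketch below assumes a hypothetical binary property myproject:pdfdata whose mimetype is stored in a hypothetical property myproject:pdfmimetype. Whether a custom map replaces or extends the default configuration is not stated here, so the sketch repeats the entries from the example above to be safe.

```yaml
importer:
  binaryProperties:
    # Entries from the example above, repeated in case a custom map
    # replaces the default configuration (an assumption).
    'sophora:binarydata': 'sophora:mimetype'
    'sophora-extension:binarydata': 'sophora:mimetype'
    'core:binarydata': 'core:mimetype'
    # Hypothetical project-specific binary property and its mimetype property.
    'myproject:pdfdata': 'myproject:pdfmimetype'
```

In the Sophora XML, both properties then have to appear on the same level, as in the sophora-extension:binarydata example above.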