The Sophora Repo Exporter is a standalone tool that connects to a Sophora Server and exports data to XML. A flexible configuration lets you select which part of the repository is exported.
Capabilities
The Sophora Repo Exporter exports a part of a repository. The repository can hold a wide variety of documents and configuration data. Therefore the exporter can be configured to export exactly the data you want. The following parts can be configured and combined as you want:
- Complete administration areas
- Single items of administrative data
- Documents
Here is the list of administrative data that can be exported:
- Node types and their associated configuration
- Structure nodes (and their hierarchy documents)
- A full set of all system documents
For documents, you can select precisely which ones to export:
- By ID (UUID, Sophora ID or external ID)
- By query (XPath or Solr)
- By tag
- By document type
- By date (modification date, creation date)
- By relative time since modification
- By structure path
- By document URL
- Referenced documents by defining a recursion level
- Only documents which have changed since the last export
For the simple exports from the DeskClient you can also configure which version of the Sophora XML should be produced, which properties should be ignored and which text properties should be treated as references.
Installation
While installing the Sophora Repo Exporter, it is recommended to use the following folder hierarchy:
cms-install-directory
  apps
    sophora-exporter-1.54.0
      sophora-exporter.sh
      ...
    sophora-exporter > Symbolic link to sophora-exporter-1.54.0
    ...
  repoexporter
    config
      sophora-exporter.json
    logs
    sophora-exporter.sh > Symbolic link to ../apps/sophora-exporter/sophora-exporter.sh
  ...
This hierarchy is analogous to the directory structure of the Sophora Server.
Configuration
To tell the Repo Exporter what to export, you need to create a configuration file. Everything that should be exported is defined in that file. Most parts are optional and can be combined with all other optional parts.
Syntax
The format of the configuration file is JSON. This means that the whole configuration is surrounded by curly brackets ({}). Each keyword is followed by a colon (:) and may optionally be surrounded by quotes ("). Values must be given in quotes if they are text. Options that belong together are surrounded by curly brackets. A list of configuration options is given in square brackets ([]). Different configuration blocks are separated by a comma (,).
For example, a single "documents" configuration that defines a list of UUIDs of documents to export is configured as follows.
Note that the first keyword, "documents", is written without quotes and "uuids" is written with quotes. This merely illustrates that both are possible. Quoted keywords are correct JSON syntax; omitting the quotes is allowed for convenience. Choose the style that you prefer.
documents: [
  {
    "uuids": [
      "78f91fd7-740e-419b-9698-29856b56f4d6",
      "a595f746-d893-4c4c-8a25-fb82bd69f314"
    ]
  }
]
Required Settings
You must specify the Sophora Server to which the Repo Exporter should connect. This requires the URL for the connection as well as the username and password for the login.
"sophoraServer": {
  "host": "http://localhost:1196",
  "username": "admin",
  "password": "admin"
},
You also need to define to which folder files will be written. In that folder subfolders will be created for the different configuration blocks. The keyword is "exportDir":
"exportDir": "/path/to/exportData",
Optional Settings
Connection retries and the read timeout can also be configured. The connectRetryInterval and readTimeout values are specified in seconds.
"sophoraServer": {
  "host": "http://localhost:1196",
  "username": "admin",
  "password": "admin",
  "connectRetries": 3,
  "connectRetryInterval": 5,
  "readTimeout": 600
}
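Putting the required and optional connection settings together, a complete configuration file might look like the following sketch. The host, credentials, export path and UUID are the placeholder values used in the snippets above, and the "documents" block is only one possible choice of what to export.

```json
{
  "sophoraServer": {
    "host": "http://localhost:1196",
    "username": "admin",
    "password": "admin",
    "connectRetries": 3,
    "connectRetryInterval": 5,
    "readTimeout": 600
  },
  "exportDir": "/path/to/exportData",
  "documents": [
    {
      "uuids": [
        "78f91fd7-740e-419b-9698-29856b56f4d6"
      ]
    }
  ]
}
```

With these values, the exporter would wait 5 seconds between connection attempts and allow 600 seconds for a read before timing out.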
Export Settings
There are three main configuration parts that define the data to export. Each has its own configuration block with its specific settings.
- Complete administration areas
- Single items of administrative data
- Documents
Administration Areas
Under "adminExport" you specify that a whole area of the administration view should be exported. The special keyword "full" exports all areas, which is the same as using the context menu item "Export all..." in the DeskClient.
Option | Description |
---|---|
full | The whole administration data is exported. This includes all options below. |
fullWithSeparatedNodeTypes | The whole administration data is exported, but CNDs and node type configurations are separated by node type. This includes all options below. |
nodetypes | All node types and their configuration (CND and node type configurations). This option includes used default configurations and form field groups. |
nodetypesMinimal | All node types and their configuration (CND and node type configurations). The files will be separated by node type. |
structure | All sites and their full structure (all structure nodes). |
categories | The legacy categories. |
allSystemDocuments | All system documents, such as select values, paragraph types and configuration documents (all documents with the mix-in sophora-mix:systemDocument). |
Example:
"adminExport": [
  "nodetypes",
  "structure",
  "allSystemDocuments"
]
Administrative Data Elements
Use "adminElementExport" to export single items of administrative data. This can be a single node type or only a part of the structure. So for each option you can define a list of items which should be exported.
Option | Values | Description |
---|---|---|
nodetypes | List of document type names to export "full" | The CNDs, node type configurations, default configurations and form field groups of the given node types will be exported. Also includes referenced node types like super types, mixins and node types of virtual properties. |
nodetypesMinimal | List of document type names to export "minimal" | The CNDs and node type configurations of the given node types will be exported. The files will be separated by node type. |
structureNodes | List of UUIDs or paths | The given structure nodes and their substructure will be exported. |
exportRelevantIndexDocuments | true or false | For each exported structure node the default document will also be exported (default false). |
exportRelevantHierarchyDocuments | true or false | For each exported structure node the hierarchy document will also be exported (default false). |
Example:
"adminElementExport": {
  "nodetypes": [
    "example-nt:story"
  ],
  "structureNodes": [
    "eb55f5da-f4f8-4f18-8965-bfcaf9e9d10a",
    "/test/structure/path"
  ]
}
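As a variation on the example above, the following sketch combines "nodetypesMinimal" with the two boolean options from the table. It assumes that the boolean options are placed directly inside the "adminElementExport" block, next to the lists; the node type name and structure path are the same placeholders as above.

```json
{
  "adminElementExport": {
    "nodetypesMinimal": [
      "example-nt:story"
    ],
    "structureNodes": [
      "/test/structure/path"
    ],
    "exportRelevantIndexDocuments": true,
    "exportRelevantHierarchyDocuments": true
  }
}
```

Under that assumption, the export would contain the CND and node type configuration of the one node type, the structure below the given path, and the default and hierarchy documents of each exported structure node.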
Documents
In a "documents" block a list of multiple document filters can be specified. Inside a single block multiple keywords can be used to specify which documents are meant to be exported. All these keywords are ANDed together to collect documents.
Some of these keywords are ignored if daemonMode=true; for details, check the following table.
Option | Values | Description | Available in daemonMode |
---|---|---|---|
uuids | List of UUIDs | Documents can directly be given by their UUID of the source repository. | yes |
externalIds | List of external IDs | Documents can directly be given by their unique external ID across repositories. | yes |
xpathQueries | List of XPath queries | A full JCR query which searches for arbitrary documents in the source repository. | yes |
solrQueries | List of Solr queries | A Solr query which searches for arbitrary documents in the source repository. | yes |
documentUrls | List of document URLs | Documents can be given by their URL from the delivery. Note that the Sophora Server must know the deliveries for this feature. | no |
tags | List of tags | All documents with any of the given tags will be exported. | yes |
indexDocuments | true or false | Export the default documents of all structure nodes (default false). | no |
hierarchyDocuments | true or false | Export the hierarchy documents of all structure nodes (default false). | no |
exportRelevantStructure | true or false | If true, the corresponding structure node hierarchy is exported for all exported documents (default false). | no |
exportRelevantIndexDocuments | true or false | If true and exportRelevantStructure is also true, the default document of each relevant structure node will also be exported (default false). | no |
exportRelevantHierarchyDocuments | true or false | If true and exportRelevantStructure is also true, the hierarchy document of each relevant structure node will also be exported (default false). | no |
criteria | List of criteria blocks | Specify documents by some criteria (see below). | yes |
recursionCriteria (Since 4.0.3) | List of criteria blocks | Documents that are exported recursively must meet these criteria (see below). | no |
maxRecursionDepth | integer | If greater than zero, referenced documents will be exported for each level until the given depth is reached. The referenced documents are exported in individual files. | yes |
url | URL | A web address which returns a documents configuration block. The defined options beside the URL will be merged with the fetched options. | yes |
subDir (Since 4.0.3) | Path | Defines a subfolder of the export folder in which the documents are exported. | no |
The options of a "criteria" block are ANDed together, so a document must match all given options. Several blocks can be listed; these are ORed, so a document only needs to match one of them.
Option | Values | Description |
---|---|---|
documentTypes | node type names | Matches all documents with any of the given types |
structurePath | structure path | Matches all documents located below the given structure path. |
minModificationDate | date with optional time | Matches all documents modified since the given date (can also be combined with maxModificationDate) |
maxModificationDate | date with optional time | Matches all documents modified until the given date (can also be combined with minModificationDate) |
minCreationDate | date with optional time | Matches all documents created since the given date (can also be combined with maxCreationDate) |
maxCreationDate | date with optional time | Matches all documents created until the given date (can also be combined with minCreationDate) |
modifiedInLastDays | positive integer | Matches all documents modified within the given number of most recent days |
Example:
"documents": [
  {
    "uuids": [
      "78f91fd7-740e-419b-9698-29856b56f4d6",
      "a595f746-d893-4c4c-8a25-fb82bd69f314"
    ],
    "exportRelevantStructure": true,
    "exportRelevantIndexDocuments": true,
    "exportRelevantHierarchyDocuments": true
  }, {
    "externalIds": [
      "68ad8b0b-dc2e-4a29-a6aa-e22f582dfd6f",
      "sophora.configuration.relevance"
    ],
    "xpathQueries": [
      "//element(*, sophora-content-nt:story)[@sophora-content:importend = 'true']"
    ]
  }, {
    "solrQueries": [
      "sophora_id_s: le-bon-marche-logo100"
    ],
    "criteria": [{
      "modifiedInLastDays": 7
    }]
  }, {
    "documentUrls": [
      "http://www.subshell.com/demosite/trendcities/london/New-Shapes-New-Styles-from-London-City-Report100.html",
      "http://www.subshell.com/demosite/trendcities/index.html"
    ]
  }, {
    "criteria": [{
      "documentTypes": ["sophora-demo-nt:textfields", "sophora-demo-nt:datefields"],
      "structurePath": "/demosite/home"
    }]
  }, {
    "criteria": [{
      "minModificationDate": "2014-08-01", "maxModificationDate": "2014-08-02"
    }, {
      "minCreationDate": "2014-06-01T12:34", "maxCreationDate": "2014-06-02T16:54:32.999"
    }]
  }, {
    "tags": [
      "vision",
      "hamburg"
    ],
    "uuids": [
      "e6907385-410d-4611-a73b-736928fffe17"
    ],
    "maxRecursionDepth": 3
  }
]
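The example above does not use "recursionCriteria" or "subDir". The following sketch shows how they might be combined with "maxRecursionDepth". It assumes that "recursionCriteria" takes criteria blocks with the same options as "criteria"; the subfolder name "home-recent" is a hypothetical placeholder. According to the table, neither "recursionCriteria" nor "subDir" is available in daemonMode.

```json
{
  "documents": [
    {
      "criteria": [{
        "structurePath": "/demosite/home",
        "modifiedInLastDays": 30
      }],
      "maxRecursionDepth": 2,
      "recursionCriteria": [{
        "documentTypes": ["sophora-demo-nt:textfields"]
      }],
      "subDir": "home-recent"
    }
  ]
}
```

Under these assumptions, documents below /demosite/home that were modified in the last 30 days are selected, referenced documents are followed for two levels but only exported if they are of the given type, and the files are written to the "home-recent" subfolder of the export folder.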
Other Options
The following options can be given to fine-tune the export. They must be specified inside the main configuration block and cannot be nested. They apply to all exported elements.
Option | Values | Description |
---|---|---|
deltaExport | true or false (Default: false) | Only documents modified since the last export will be exported (again). |
daemonMode | true or false | Listens for document modifications and exports documents immediately if they match the options. Can be combined with deltaExport. The execution will continue until you wish to stop the Exporter. |
exportDocumentsWithTimestamp | true or false | Appends a timestamp to the export filenames |
propertiesNotToExportInSophoraXml | List of property names | These properties will not be exported to XML. If not defined a default set will be used. |
stringToReferenceProperties | Map of document types to a list of property names | For each document type you can specify string properties which hold a reference to other documents. |
groovyExtensionDir | Path to a directory | Groovy scripts in this directory can modify the created xml. They are executed for every document in the created xml files. |
maxRecursionDepthPerFile | integer | If greater than zero, referenced documents are exported for each level until the given depth is reached. The referenced documents are included in the XML file of the main document (since 1.53.3, 1.54.4, 2.0.2, 2.1.0). Even referenced documents retrieved with documents[].maxRecursionDepth will contain as many referenced documents as configured here. |
includeLiveVersionInXml | true or false (Default: true) | When true, the live version of the exported document is included in the xml. |
xmlVersion | Version | The Sophora XML version to create, like "2.8". This can be used if the exported data is intended to be imported in a repository which runs an older Sophora version. If not set, the newest Sophora XML version will be used. |
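The following sketch shows where these options might sit in a configuration file: at the top level, next to "sophoraServer" and "exportDir". The property names "sophora-demo:internalNote" and "sophora-demo:relatedId", the script directory and the chosen values are hypothetical placeholders, not defaults of the exporter.

```json
{
  "sophoraServer": {
    "host": "http://localhost:1196",
    "username": "admin",
    "password": "admin"
  },
  "exportDir": "/path/to/exportData",
  "deltaExport": true,
  "exportDocumentsWithTimestamp": true,
  "includeLiveVersionInXml": false,
  "maxRecursionDepthPerFile": 1,
  "xmlVersion": "2.8",
  "propertiesNotToExportInSophoraXml": [
    "sophora-demo:internalNote"
  ],
  "stringToReferenceProperties": {
    "sophora-demo-nt:textfields": [
      "sophora-demo:relatedId"
    ]
  },
  "groovyExtensionDir": "/path/to/groovy-scripts",
  "documents": [
    {
      "tags": [
        "vision"
      ]
    }
  ]
}
```

Because the default set of ignored properties is only used when "propertiesNotToExportInSophoraXml" is not defined, a list given here presumably replaces that default set rather than extending it.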
Groovy scripts
Groovy scripts can modify the generated xml. The scripts are executed for each document included in the created xml files. The following packages are available:
- org.jdom2
- com.subshell.sophora.api
- com.subshell.sophora.api.content
- com.subshell.sophora.api.content.value
- com.subshell.sophora.api.exceptions
- com.subshell.sophora.api.nodetype
- com.subshell.sophora.api.search
- com.subshell.sophora.api.structure
The following variables are also accessible:
- sophoraClient: An instance of ISophoraClient
- config: The exporter configuration
- document: The INode of the exported document
- xml: The JDOM element of the created XML
This example script adds the URL of the document to the created XML as an additional property:
def live = (document.getBoolean("sophora:isOnline") == true)
def url = sophoraClient.getDocumentUrl(document.getUUID(), live)
if (url != "") {
  def ps = xml.getChild("properties", xml.getNamespace())
  def p = new Element("property", xml.getNamespace())
  p.setAttribute("name", "ext:sourceurl")
  ps.addContent(p)
  def v = new Element("value", xml.getNamespace())
  p.addContent(v)
  v.setText(url)
}
Exporting
To run the Sophora Repo Exporter you need a configuration file as described above. Execute the tool with the following command line in its folder:
# Linux or other UNIX-like OS
./sophora-exporter.sh
# Windows
sophora-exporter.bat
If you want to use a custom name for the configuration file, use the following command line:
# Linux or other UNIX-like OS
./sophora-exporter.sh -Dapp.config=myexporter-config.json
# Windows
sophora-exporter.bat -Dapp.config=myexporter-config.json
Replacing a Repository
The Sophora Repo Exporter also includes a script to replace a complete repository (reset_repository.sh). This script performs the following steps:
- stop a local importer
- stop a local server
- back up the current repository in a backup folder with a timestamp
- delete the cache and the log folder of the local server
- clear the incoming/failure/success/temp folders of the importer
- restart the local server and importer
- start the Sophora Repo Exporter which will export a (remote) repository specified by its configuration (provided by the user)
- copy all exported data into the watchfolder of the importer
A configuration to export everything of a repository may look like this:
{
  "sophoraServer": {
    "host": "remote.host.com:1199",
    "username": "admin",
    "password": "admin"
  },
  "deltaExport": true,
  "exportDir": "exportData",
  "adminExport": [
    "full"
  ],
  "documents": [
    {
      "xpathQueries": [
        "element(*, sophora-mix:document)"
      ]
    }
  ]
}
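For the importer to pick up the exported files, it must be configured to include subfolders of its watchfolder:
sophora.importer.watchfolder.includeSubfolder=true
This is necessary because the exporter creates subfolders that contain the actual exported XML files.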