Repo Exporter 3

Repo Exporter Guide

The Sophora Repo Exporter makes it easy to export parts of a repository. This is useful, for example, for copying current live data to a fresh test system.

The Sophora Repo Exporter is a standalone tool. It connects to a Sophora Server and exports data to XML. A flexible configuration lets you define which part of the repository is exported.

Capabilities

The Sophora Repo Exporter exports a part of a repository. Because a repository can hold a wide variety of documents and configuration data, the exporter can be configured to export exactly the data you need. The following parts can be selected and combined freely:

  • Complete administration areas
  • Single items of administrative data
  • Documents

Here is the list of administrative data that can be exported:

  • Node types and their associated configuration
  • Structure (and its hierarchy documents)
  • Roles
  • Users
  • A full set of all system documents

For the documents you can select precisely what documents you want to export:

  • By ID (UUID, Sophora ID or external ID)
  • By query (XPath or Solr)
  • By tag
  • By document type
  • By date (modification date, creation date)
  • By relative time since modification
  • By structure path
  • By document URL
  • Referenced documents by defining a recursion level
  • Only documents which have changed since the last export

For the simple exports from the DeskClient you can also configure which version of the Sophora XML should be produced, which properties should be ignored and which text properties should be treated as references.

Installation

While installing the Sophora Repo Exporter, it is recommended to use the following folder hierarchy:

cms-install-directory
		apps
				sophora-exporter-1.54.0
						sophora-exporter.sh
						...
				sophora-exporter > Symbolic link to sophora-exporter-1.54.0
				...
		repoexporter
				config
						sophora-exporter.json
				logs
				sophora-exporter.sh > Symbolic link to ../apps/sophora-exporter/sophora-exporter.sh
		...

This hierarchy is analogous to the directory structure of the Sophora Server.

Configuration

To tell the Repo Exporter what to export, you need to create a configuration file. Everything that should be exported is defined in that file. Most parts are optional and can be combined with all other optional parts.

Syntax

The configuration file is written in JSON. The whole configuration is enclosed in curly brackets ({}). Each keyword is followed by a colon (:) and may optionally be enclosed in quotes ("). Text values must be given in quotes. Options that belong together are grouped in curly brackets. A list of values or blocks is given in square brackets ([]). Configuration blocks are separated by commas (,).

For example, a single "documents" configuration that defines a list of UUIDs of documents to export looks as follows.
Note that the first keyword, "documents", is unquoted while "uuids" is quoted. This merely illustrates that both forms are accepted. Quoted keywords are standard JSON; the Repo Exporter also accepts unquoted keywords for convenience. Choose the style that you prefer.

documents: [
	{
		"uuids": [
			"78f91fd7-740e-419b-9698-29856b56f4d6",
			"a595f746-d893-4c4c-8a25-fb82bd69f314"
		]
	}
]

Required Settings

You must specify to which Sophora Server the Repo Exporter should connect. You need to specify the URL for connecting and the username with the password for the login.

"sophoraServer": {
		"host": "http://localhost:1196",
		"username": "admin",
		"password": "admin"
	},

You also need to define to which folder files will be written. In that folder subfolders will be created for the different configuration blocks. The keyword is "exportDir":

"exportDir": "/path/to/exportData",

Optional Settings

Connection retries and read timeout can also be configured. The connectRetryInterval is specified in seconds.

"sophoraServer": {
 		"host": "http://localhost:1196",
 		"username": "admin",
 		"password": "admin",
 		"connectRetries": 3,
 		"connectRetryInterval": 5
}
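Putting the settings above together, a complete configuration file might look like the following sketch. It uses only keywords and values that appear on this page and adds one "adminExport" and one "documents" block (both described below) so that there is something to export. Treat it as a starting point under those assumptions, not as a reference configuration.

```json
{
	"sophoraServer": {
		"host": "http://localhost:1196",
		"username": "admin",
		"password": "admin",
		"connectRetries": 3,
		"connectRetryInterval": 5
	},
	"exportDir": "/path/to/exportData",
	"adminExport": [
		"nodetypes"
	],
	"documents": [
		{
			"uuids": [
				"78f91fd7-740e-419b-9698-29856b56f4d6"
			]
		}
	]
}
```

All keywords are quoted here, so the file is also accepted by strict JSON tools such as validators or editors.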

Export Settings

Three main configuration parts define the data to export. Each has its own configuration block with its specific settings.

  1. Complete administration areas
  2. Single items of administrative data
  3. Documents

Administration Areas

Under "adminExport" you specify that a whole area of the administration view should be exported. The special keyword "full" exports all areas, which is the same as using the context menu item "Export all..." in the DeskClient.

Options for "adminExport"
| Option | Description |
| --- | --- |
| full | All administration data is exported. This includes all options below. Users will be exported with password hashes. |
| fullWithoutPasswords | The same as 'full' but without user passwords. |
| fullWithSeparatedNodeTypes | All administration data is exported, but CNDs and node type configurations are separated by node type. This includes all options below. Since version 3.0.6. |
| nodetypes | All node types and their configuration (CND and node type configurations). This option includes used default configurations and form field groups. |
| nodetypesMinimal | All node types and their configuration (CND and node type configurations). The files will be separated by node type. Since version 3.0.6. |
| structure | All sites and their full structure (all structure nodes). |
| categories | The legacy categories. |
| allSystemDocuments | All system documents such as select values, paragraph types and configuration documents (all documents with the mix-in sophora-mix:systemDocument). |
| roles | All roles. |
| users | All users with their hashed passwords. |
| usersWithoutPasswords | All users without their passwords. |

Example:

"adminExport": [
		"nodetypes",
		"structure",
		"allSystemDocuments"
	]
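As a further sketch, the options from the table can be combined to export node types (one file set per node type), roles and users, leaving out password hashes. Only the "adminExport" block is shown; the required "sophoraServer" and "exportDir" settings are omitted. How overlapping options (for example "full" together with "users") are handled is not described here, so the example avoids overlaps.

```json
{
	"adminExport": [
		"nodetypesMinimal",
		"roles",
		"usersWithoutPasswords"
	]
}
```

Leaving out password hashes is a reasonable default when the export is handed to another team or copied to a test system.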

Administrative Data Elements

Use "adminElementExport" to export single items of administrative data. This can be a single node type, specific users or only a part of the structure. So for each option you can define a list of items which should be exported.

Options for "adminElementExport"
| Option | Values | Description |
| --- | --- | --- |
| nodetypes | List of document type names | The CNDs, node type configurations, default configurations and form field groups of the given node types will be exported. Also includes referenced node types like super types, mixins and node types of virtual properties. |
| nodetypesMinimal | List of document type names to export "minimal" | The CNDs and node type configurations of the given node types will be exported. The files will be separated by node type. Since version 3.0.6. |
| structureNodes | List of UUIDs or paths | The given structure nodes and their substructure will be exported. |
| exportRelevantIndexDocuments | true or false | For each exported structure node the default document will also be exported (default false). |
| exportRelevantHierarchyDocuments | true or false | For each exported structure node the hierarchy document will also be exported (default false). |
| roles | List of UUIDs | The given roles with their configuration. |
| usersWithPasswords | List of user names | The given users will be exported with all their configuration including the hashed passwords. |
| usersWithoutPasswords | List of user names | The given users will be exported without their passwords. |

Example:

"adminElementExport": { 
    	"nodetypes": [
			"example-nt:story"
    	],
    	"structureNodes": [
			"eb55f5da-f4f8-4f18-8965-bfcaf9e9d10a",
			"/test/structure/path"
    	]
    }
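The remaining options from the table can be combined in the same way. The following sketch exports one structure branch together with its default and hierarchy documents, plus a single user without the password. It reuses the path, node type and user name that appear elsewhere on this page; replace them with values from your own repository. The required "sophoraServer" and "exportDir" settings are omitted.

```json
{
	"adminElementExport": {
		"nodetypesMinimal": [
			"example-nt:story"
		],
		"structureNodes": [
			"/test/structure/path"
		],
		"exportRelevantIndexDocuments": true,
		"exportRelevantHierarchyDocuments": true,
		"usersWithoutPasswords": [
			"admin"
		]
	}
}
```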

Documents


Options for "documents"
| Option | Values | Description | Available in daemonMode |
| --- | --- | --- | --- |
| uuids | List of UUIDs | Documents can directly be given by their UUID of the source repository. | yes |
| externalIds | List of external IDs | Documents can directly be given by their unique external ID across repositories. | yes |
| xpathQueries | List of XPath queries | A full JCR query which searches for arbitrary documents in the source repository. | yes |
| solrQueries | List of Solr queries | A Solr query which searches for arbitrary documents in the source repository. | yes |
| documentUrls | List of document URLs | Documents can be given by their URL from the delivery. Note that the Sophora Server must know the deliveries for this feature. | no |
| tags | List of tags | All documents with any of the given tags will be exported. | yes |
| indexDocuments | true or false | Export the default documents of all structure nodes (default false). | no |
| hierarchyDocuments | true or false | Export the hierarchy documents of all structure nodes (default false). | no |
| exportRelevantStructure | true or false | If true, for all exported documents their corresponding structure node hierarchy is exported (default false). | no |
| exportRelevantIndexDocuments | true or false | If true and exportRelevantStructure is also true, for each relevant structure node its default document will also be exported (default false). | no |
| exportRelevantHierarchyDocuments | true or false | If true and exportRelevantStructure is also true, for each relevant structure node its hierarchy document will also be exported (default false). | no |
| criteria | List of criteria blocks | Specify documents by some criteria (see below). | yes |
| recursionCriteria (Since 3.1.5) | List of criteria blocks | Documents that are exported recursively must meet these criteria (see below). | no |
| maxRecursionDepth | integer | If greater than zero, referenced documents will be exported for each level until the given depth is reached. The referenced documents are exported in individual files. | yes |
| url | URL | A web address which returns a documents configuration block. The defined options beside the URL will be merged with the fetched options. | yes |
| subDir (Since 3.1.5) | Path | Defines a subfolder of the export folder in which the documents are exported. | no |

The options of a "criteria" block are ANDed together, so a document must match all given options. Several criteria blocks can be listed; these blocks are ORed, so a document only needs to match one of them.

Options for "criteria"
| Option | Values | Description |
| --- | --- | --- |
| documentTypes | node type names | Matches all documents with any of the given types. |
| structurePath | structure path | Matches all documents located below the given structure path. |
| minModificationDate | date with optional time | Matches all documents modified since the given date (can also be combined with maxModificationDate). |
| maxModificationDate | date with optional time | Matches all documents modified until the given date (can also be combined with minModificationDate). |
| minCreationDate | date with optional time | Matches all documents created since the given date (can also be combined with maxCreationDate). |
| maxCreationDate | date with optional time | Matches all documents created until the given date (can also be combined with minCreationDate). |
| modifiedInLastDays | positive integer | Matches all documents modified within the given number of days. |

Example:

"documents": [
		{
			"uuids": [
				"78f91fd7-740e-419b-9698-29856b56f4d6",
				"a595f746-d893-4c4c-8a25-fb82bd69f314"
			],
			"exportRelevantStructure": true,
			"exportRelevantIndexDocuments": true,
			"exportRelevantHierarchyDocuments": true
		}, {
			"externalIds": [
				"68ad8b0b-dc2e-4a29-a6aa-e22f582dfd6f",
				"sophora.configuration.relevance"
			],
			"xpathQueries": [
				"//element(*, sophora-content-nt:story)[@sophora-content:importend = 'true']"
			]
		}, {
			"solrQueries": [
				"sophora_id_s: le-bon-marche-logo100"
			],
			"criteria": [{
				"modifiedInLastDays": 7
			}]
		}, {
			"documentUrls": [
				"http://www.subshell.com/demosite/trendcities/london/New-Shapes-New-Styles-from-London-City-Report100.html",
				"http://www.subshell.com/demosite/trendcities/index.html"
			]
		}, {
			"criteria": [{
				"documentTypes": ["sophora-demo-nt:textfields", "sophora-demo-nt:datefields"],
				"structurePath": "/demosite/home"
			}]
		}, {
			"criteria": [{
				"minModificationDate": "2014-08-01", "maxModificationDate": "2014-08-02"
			}, {
				"minCreationDate": "2014-06-01T12:34", "maxCreationDate": "2014-06-02T16:54:32.999"
			}]
		}, {
			"tags": [
				"vision",
				"hamburg"
			],
			"uuids": [
				"e6907385-410d-4611-a73b-736928fffe17"
			],
			"maxRecursionDepth": 3
		}
	]
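The example above does not use "recursionCriteria" or "subDir". The following sketch shows how they might be combined with "maxRecursionDepth". It assumes that "recursionCriteria" takes the same block format as "criteria" (the options table lists both as a list of criteria blocks) and that "subDir" is given relative to the export folder. The subfolder name "demosite-home" is a made-up placeholder.

```json
{
	"documents": [
		{
			"criteria": [{
				"documentTypes": ["sophora-demo-nt:textfields"],
				"structurePath": "/demosite/home",
				"modifiedInLastDays": 7
			}],
			"maxRecursionDepth": 2,
			"recursionCriteria": [{
				"documentTypes": ["sophora-demo-nt:datefields"]
			}],
			"subDir": "demosite-home"
		}
	]
}
```

Under these assumptions, the block selects documents of one type below "/demosite/home" that were modified in the last seven days. It follows their references up to two levels deep, but only to documents of the type named in "recursionCriteria". Since both options are marked as unavailable in daemonMode, this sketch only applies to regular (non-daemon) runs.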

Other Options

The following options can be given to fine-tune the export. They must be specified inside the main configuration block and cannot be nested. They apply to all exported elements.

| Option | Values | Description |
| --- | --- | --- |
| deltaExport | true or false (Default: false) | Only documents modified since the last export will be exported (again). |
| daemonMode | true or false | Listens for document modifications and exports documents immediately if they match the options. Can be combined with deltaExport. The execution will continue until you stop the Exporter. |
| exportDocumentsWithTimestamp | true or false | Appends a timestamp to the export filenames. |
| propertiesNotToExportInSophoraXml | List of property names | These properties will not be exported to XML. If not defined, a default set will be used. |
| stringToReferenceProperties | Map of document types to a list of property names | For each document type you can specify string properties which hold a reference to other documents. |
| groovyExtensionDir | Path to a directory | Groovy scripts in this directory can modify the created XML. They are executed for every document in the created XML files. |
| maxRecursionDepthPerFile | integer | If greater than zero, referenced documents are exported for each level until the given depth is reached. The referenced documents are included in the XML file of the main document. (Since 1.53.3, 1.54.4, 2.0.2, 2.1.0). Even referenced documents retrieved with documents[].maxRecursionDepth will contain as many referenced documents as configured here. |
| includeLiveVersionInXml | true or false (Default: true) | When true, the live version of the exported document is included in the XML. |
| xmlVersion | Version | The Sophora XML version to create, like "2.8". This can be used if the exported data is intended to be imported in a repository which runs an older Sophora version. If not set, the newest Sophora XML version will be used. |
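As an illustration, these top-level options might be set as in the following sketch. The property names "example:internalNote" and "example:relatedStoryId" are hypothetical placeholders and not properties of a real Sophora model. The node type "example-nt:story" is the one used in the examples on this page. The required "sophoraServer" and "exportDir" settings and the export blocks are omitted.

```json
{
	"deltaExport": true,
	"exportDocumentsWithTimestamp": true,
	"includeLiveVersionInXml": false,
	"xmlVersion": "2.8",
	"maxRecursionDepthPerFile": 1,
	"propertiesNotToExportInSophoraXml": [
		"example:internalNote"
	],
	"stringToReferenceProperties": {
		"example-nt:story": ["example:relatedStoryId"]
	}
}
```

Note that setting "propertiesNotToExportInSophoraXml" replaces the default set of ignored properties, since the default is only used when the option is not defined.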

Groovy scripts

Groovy scripts can modify the generated xml. The scripts are executed for each document included in the created xml files. The following packages are available:

  • org.jdom2
  • com.subshell.sophora.api
  • com.subshell.sophora.api.content
  • com.subshell.sophora.api.content.value
  • com.subshell.sophora.api.exceptions
  • com.subshell.sophora.api.nodetype
  • com.subshell.sophora.api.search
  • com.subshell.sophora.api.structure

The following variables are also accessible:

  • sophoraClient: An instance of ISophoraClient
  • config: The exporter configuration
  • document: The INode of the exported document
  • xml: The JDOM element of the created XML

This example script adds the URL of the document to the created XML as an additional property:

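// true if the document's "sophora:isOnline" property is set to true
def live = (document.getBoolean("sophora:isOnline") == true)
def url = sophoraClient.getDocumentUrl(document.getUUID(), live)
// Only add the property if a URL was returned (skips null and empty strings)
if (url) {
	// Append <property name="ext:sourceurl"><value>URL</value></property> to <properties>
	def ps = xml.getChild("properties", xml.getNamespace())
	def p = new Element("property", xml.getNamespace())
	p.setAttribute("name", "ext:sourceurl")
	ps.addContent(p)
	def v = new Element("value", xml.getNamespace())
	p.addContent(v)
	v.setText(url)
}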
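For the exporter to pick up a script like this, the directory that contains it has to be referenced with the "groovyExtensionDir" option in the main configuration block. A minimal sketch, in which the path is a placeholder and all other settings are omitted:

```json
{
	"groovyExtensionDir": "/path/to/groovy-scripts"
}
```

The expected file names or extensions of the scripts in that directory are not specified on this page.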

Exporting

To run the Sophora Repo Exporter you need a configuration file as described above. Execute the tool with the following command line in its folder:

# Linux or other UNIX-like OS
./sophora-exporter.sh
# Windows
sophora-exporter.bat

If you want to use a custom name for the configuration file, use the following command line:

# Linux or other UNIX-like OS
./sophora-exporter.sh -Dapp.config=myexporter-config.json
# Windows
sophora-exporter.bat -Dapp.config=myexporter-config.json

Replacing a Repository

The Sophora Repo Exporter also includes a script to replace a complete repository (reset_repository.sh). This script performs the following steps:

  • stop a local importer
  • stop a local server
  • back up the current repository in a backup folder with a timestamp
  • delete the cache and the log folder of the local server
  • clear the incoming/failure/success/temp folders of the importer
  • restart the local server and importer
  • start the Sophora Repo Exporter, which exports a (remote) repository as specified by its user-provided configuration
  • copy all exported data into the watchfolder of the importer

A configuration to export everything of a repository may look like this:

{
	"sophoraServer": {
		"host": "remote.host.com:1199",
		"username": "admin",
		"password": "admin"
	},

	"deltaExport": true,

	"exportDir": "exportData",

	"adminExport": [
		"full"
	],

	"documents": [
		{
			"xpathQueries": [
				"element(*, sophora-mix:document)"
			]
		}
	]
}

Last modified on 2/15/23

The content of this page is licensed under the CC BY 4.0 License. Code samples are licensed under the MIT License.
