Version Store

Migration Guide

Technical notes on the Sophora 4 to Sophora 5 migration.

Sophora 4 to Sophora 5

Sophora Server 5 introduces the Version Store, which holds all versions of all documents except the working versions. In earlier releases these versions were part of the main repository and the archive repository.

Since Staging Servers don't contain any document versions besides the working version, they neither have nor need a Version Store and are therefore not affected by this migration.

A Sophora Server running version 5 automatically stores all versions it creates or synchronizes in the new Version Store, but existing versions are not migrated automatically. A full sync is required to get all existing data into a new Version Store.

Migration order overview

This section gives an overview of how to update all of your servers from version 4 to 5 while properly setting up the Version Store.

First, configure a new Sophora Replica. This server should be launched with the latest release of Sophora Server 5 and be connected to a Postgres database as described in the Installation and Configuration overview.

When this server is started, it will connect to your current Sophora Primary server and start synchronizing all documents and all versions into the new format.

After the new replica server is successfully synced and up to date, you can stop it and make a backup of its repositories and Postgres database. This backup can then be used to give all other servers a properly migrated setup. For more information refer to the "Backups" section.

Before upgrading your Primary, we recommend upgrading your Sophora Staging Servers to version 5. Sophora Staging Servers are not affected by the introduction of the Version Store: they don't have a Version Store and do not need a Postgres database (they will require a Postgres database in Sophora 6). The breaking changes listed in the Sophora Server Update Notes still apply, though. Most notably, the Derby driver changed from org.apache.derby.iapi.jdbc.AutoloadedDriver to org.apache.derby.jdbc.EmbeddedDriver. Apart from that, simply restart the Sophora Staging Server with version 5. A full sync is not required.

Afterwards, you can migrate your other replicas. Migrating here means that you set them up as empty replica servers and give them copies of the repository and Postgres database from the server that has undergone the full sync.

Do this one server at a time and make sure all servers are running properly before proceeding with the next one.

Once all of your Replica servers and Staging servers are running with Sophora 5, you can stop your current Primary, promote one of the Replicas to the new Primary and then start your former Primary with Sophora 5 and a migrated repository, as you did with the other replicas before.
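The order described above can be written down as a simple checklist generator. This is only a sketch for planning purposes: the function name and the server names are hypothetical, and the code does not talk to any Sophora server. It merely arranges the steps in the sequence this section recommends.

```python
def migration_plan(staging, replicas, primary, seed="new-replica"):
    """Return the migration steps in the order described in this guide.

    All server names are placeholders chosen for illustration.
    """
    steps = [
        f"Set up {seed} with Sophora 5 and Postgres; let it run a full sync from {primary}",
        f"Stop {seed} and back up its repositories and Postgres database",
    ]
    # Staging servers have no Version Store, so a restart with version 5 is enough.
    steps += [f"Restart staging server {s} with Sophora 5 (no full sync)" for s in staging]
    # Replicas are migrated one at a time from the backup of the synced server.
    steps += [f"Rebuild replica {r} from the {seed} backup and verify it" for r in replicas]
    steps.append(f"Stop {primary} and promote a migrated replica to Primary")
    steps.append(f"Start former Primary {primary} with Sophora 5 from the migrated backup")
    return steps


for number, step in enumerate(
    migration_plan(["staging-1"], ["replica-a", "replica-b"], "primary-1"), start=1
):
    print(f"{number}. {step}")
```

Listing the replicas explicitly makes it harder to skip the "one server at a time" rule when several machines are involved.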

Monitoring the sync

As the full sync can take a long time, you might want to take a more detailed look at how the sync is progressing and whether everything is running properly. The sync runs through several iterations called delta-syncs. Each iteration goes through several phases. The most important phases are:

  • Synchronizing structure nodes
  • Synchronizing system documents
  • Synchronizing all other documents
  • Synchronizing yellow data

Each of these phases is logged with a progress bar in the logs of the primary (or at least the server that was primary when the sync was started) and of the corresponding replica.

At the end of each iteration, the replica logs one of these messages:

  • SyncRequest completed. Starting new Delta-SyncRequest
  • All syncs completed, switching to regular replication

After the latter, the overall sync is done.
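To follow the iterations without reading the whole log, the two messages can be counted programmatically. The following is a minimal sketch that only relies on the two message texts quoted above. The function name is hypothetical, and the timestamps and log levels in the sample lines are made up, since the actual log layout depends on your logging configuration.

```python
DELTA_DONE = "SyncRequest completed. Starting new Delta-SyncRequest"
ALL_DONE = "All syncs completed, switching to regular replication"


def summarize_sync_log(lines):
    """Return (finished_iterations, sync_done) for an iterable of log lines."""
    iterations = 0
    done = False
    for line in lines:
        if DELTA_DONE in line:
            iterations += 1  # one iteration ended, another delta-sync follows
        elif ALL_DONE in line:
            iterations += 1  # the last iteration ended
            done = True
    return iterations, done


# Made-up sample lines; only the message texts are taken from this guide.
sample = [
    "2024-08-13 10:00:00 INFO SyncRequest completed. Starting new Delta-SyncRequest",
    "2024-08-13 11:30:00 INFO SyncRequest completed. Starting new Delta-SyncRequest",
    "2024-08-13 11:45:00 INFO All syncs completed, switching to regular replication",
]
print(summarize_sync_log(sample))
```

In practice the lines would come from the replica's log file, for example by passing an open file object to the function.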

All data in the four categories mentioned above (structure, system, documents, yellow data) is synchronized in order of modification date. As a result, the modification date of the most recent document the replica has seen should move steadily from the earliest documents in your system to the most recent ones as the sync progresses. For more insight into these dates, the server offers Prometheus metrics under the server's /prometheus endpoint. The timestamps are given in milliseconds since the Unix epoch (1970):

# HELP sophora_server_syncRequest_modificationDate Modification date for synchronisation of documents
# TYPE sophora_server_syncRequest_modificationDate gauge
sophora_server_syncRequest_modificationDate{application="Sophora Server",group="system",} 1.690901954072E12
sophora_server_syncRequest_modificationDate{application="Sophora Server",group="yellow data",} 1.690901914079E12
sophora_server_syncRequest_modificationDate{application="Sophora Server",group="document",} 1.604400674676E12
sophora_server_syncRequest_modificationDate{application="Sophora Server",group="cnd",} 1.692686909962E12
sophora_server_syncRequest_modificationDate{application="Sophora Server",group="structureNode",} 1.678890648114E12 
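
Because the gauge values are printed in scientific notation, they are hard to read at a glance. The sketch below parses metric lines like the ones above and converts each value to a UTC timestamp. It works on text that has already been fetched from the metrics endpoint. The function name is hypothetical, and the parsing is a simple regular expression tailored to this one metric, not a general Prometheus parser.

```python
import re
from datetime import datetime, timedelta, timezone

METRIC = "sophora_server_syncRequest_modificationDate"
LINE_RE = re.compile(r"^" + re.escape(METRIC) + r"\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$")
GROUP_RE = re.compile(r'group="([^"]*)"')
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_sync_dates(text):
    """Map each 'group' label to the gauge value as a UTC datetime."""
    dates = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skips '# HELP', '# TYPE' and unrelated metrics
        group = GROUP_RE.search(match.group("labels"))
        if not group:
            continue
        millis = int(float(match.group("value")))  # e.g. 1.690901954072E12
        dates[group.group(1)] = EPOCH + timedelta(milliseconds=millis)
    return dates


SAMPLE = """\
# TYPE sophora_server_syncRequest_modificationDate gauge
sophora_server_syncRequest_modificationDate{application="Sophora Server",group="system",} 1.690901954072E12
sophora_server_syncRequest_modificationDate{application="Sophora Server",group="document",} 1.604400674676E12
sophora_server_syncRequest_modificationDate{application="Sophora Server",group="structureNode",} 1.678890648114E12
"""

# Print the groups from the oldest to the newest modification date.
for name, stamp in sorted(parse_sync_dates(SAMPLE).items(), key=lambda item: item[1]):
    print(f"{name:15} {stamp.isoformat()}")
```

Polling this at intervals and comparing the dates shows whether a group is still advancing or has stalled.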

Backups

Creating copies of the JCR repositories (excluding the Version Store) works as in earlier versions of Sophora. These copies cover parts of the server's file system and their counterparts in external databases. Be aware that servers in version 5 no longer include Solr or an archive repository, so there is less to copy. In addition, you have to back up the content of the Postgres database. Several tools from the PostgreSQL environment can be used to dump the content of the database and load it into the new one.

Backup and Restore with pg_dump

The following example uses pg_dump and pg_restore on localhost:5432 with the user postgres and database sophora.

Backup

pg_dump -h localhost -p 5432 -U postgres -Fc sophora > db.dump

Restore

pg_restore -h localhost -p 5432 -U postgres -d sophora db.dump

For more information refer to https://www.postgresql.org/docs/current/app-pgdump.html and https://www.postgresql.org/docs/current/app-pgrestore.html.
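
When the same dump has to be restored on several replicas, it can help to generate the command lines instead of typing them repeatedly. The following sketch builds the argument lists for the two commands shown above. The function names and the host name in the usage example are hypothetical. Instead of the shell redirection used above, it passes the output file to pg_dump with its -f option, which has the same effect.

```python
import shlex


def pg_dump_cmd(dump_file, host="localhost", port=5432, user="postgres", database="sophora"):
    """Build the pg_dump call in custom format (-Fc), writing to dump_file."""
    return ["pg_dump", "-h", host, "-p", str(port), "-U", user,
            "-Fc", "-f", dump_file, database]


def pg_restore_cmd(dump_file, host="localhost", port=5432, user="postgres", database="sophora"):
    """Build the pg_restore call that loads dump_file into an existing database."""
    return ["pg_restore", "-h", host, "-p", str(port), "-U", user,
            "-d", database, dump_file]


# Print the commands for review; they are not executed here.
print(shlex.join(pg_dump_cmd("db.dump")))
# 'replica-2.example.internal' is a placeholder host name.
print(shlex.join(pg_restore_cmd("db.dump", host="replica-2.example.internal")))
```

To actually run a command, the returned list can be passed to subprocess.run(cmd, check=True), provided the PostgreSQL client tools are installed and credentials are available, for example through a .pgpass file.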

Last modified on 8/13/24

The content of this page is licensed under the CC BY 4.0 License. Code samples are licensed under the MIT License.
