Overview
Linkchecker is a Spring Boot Java app with an embedded Tomcat. It primarily offers a REST interface at http://localhost:8080/api/...
Requirements
Linkchecker requires Java 11 (other versions, or the use of an embedded Java, are available upon request). It also requires Sophora 4 or higher (other versions upon request).
Installing and Running
Linkchecker consists of a single JAR file. To start Linkchecker, run the following command, replacing JARFILE with the path to that file:
java -jar JARFILE
The configuration file application.yaml must be placed in the current working directory.
Using Docker
To run Linkchecker as a Docker container, use the following command:
docker run \
  -p 8080:8080 \
  --mount "type=bind,source=$(pwd)/application.yaml,target=/application.yaml" \
  docker.subshell.com/sophora/linkchecker
Configuration
The following is an example application.yaml file.
Please note that the configuration options for the connection to the Sophora Server are listed in the Spring Boot Sophora Commons documentation.
# Settings for Sophora server connection.
sophora:
  client:
    server-connection:
      urls: http://sophora.example.com:1196
      username: user
      password: pass
link-checker:
  # List of link document types to check. Each link document type must have a 'node-type' and a 'property' field.
  documents:
    - node-type: sophora-extension-nt:link
      property: sophora-extension:url
  # Cron expression determining when to run link checks.
  #
  # Default: "@midnight"
  #
  # cron: "@midnight"
  # URL to check internet connectivity. This URL is only checked every 10 min.
  internet-check-uri:
  # Page size to use when searching for link documents.
  #
  # Default: 500
  #
  # find-documents-page-size: 500
  # Set documents offline if a link is broken?
  #
  # Default: true
  #
  # broken-set-offline: true
  # Text to use to set the internal comment on documents if a link is broken.
  #
  # Default: Link nicht erreichbar
  #
  # broken-comment: Link nicht erreichbar
  # Treat link timeouts as broken?
  #
  # Default: true
  #
  # timeout-broken: true
  http:
    # Timeout for HTTP connections (msec).
    #
    # Default: 5000
    #
    # connect-timeout: 5000
    # Timeout for reading HTTP responses (msec).
    #
    # Default: 10000
    #
    # read-timeout: 10000
    # Try HTTP HEAD requests first? If the HEAD request fails, a GET request is made.
    #
    # Default: true
    #
    # try-head-first: true
    # List of HTTP status codes that should be treated as working instead of broken.
    #
    # Default: (empty list)
    #
    # additional-working-status-codes:
    #   - 500 # Internal Server Error
    #   - 503 # Service Unavailable
  # List of mappings from structure paths to proposal section names.
  # One of the mappings may omit the structure path. This mapping will be used as the default.
  #
  # Default:
  #   - proposal-section: Broken Links
  #
  # proposal-sections:
  #   - proposal-section: Broken Links
  #   - structure-path: /demosite
  #     proposal-section: Broken Links (Demo Site)
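To make the interplay of connect-timeout, read-timeout, try-head-first and additional-working-status-codes more concrete, here is a minimal Java 11 sketch of a HEAD-then-GET check. It is not Linkchecker's actual implementation: the class LinkProbeSketch and its methods are hypothetical, the rule that 2xx and 3xx responses count as working is an assumption, and the per-request timeout of java.net.http only approximates a read timeout.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Set;

// Hypothetical illustration only; not part of Linkchecker.
final class LinkProbeSketch {

    // Values mirror the documented defaults (milliseconds).
    static final Duration CONNECT_TIMEOUT = Duration.ofMillis(5000);
    static final Duration READ_TIMEOUT = Duration.ofMillis(10000);

    /** Assumption: 2xx and 3xx count as working, plus any explicitly allowed status codes. */
    static boolean isWorking(int status, Set<Integer> additionalWorkingStatusCodes) {
        return (status >= 200 && status < 400) || additionalWorkingStatusCodes.contains(status);
    }

    /** Sends one request and returns the status code, or -1 on I/O errors and timeouts. */
    static int status(HttpClient client, String method, URI uri) throws InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(READ_TIMEOUT) // request timeout, used here as a stand-in for read-timeout
                .method(method, HttpRequest.BodyPublishers.noBody())
                .build();
        try {
            return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
        } catch (IOException e) {
            // Timeouts are IOExceptions too, so they end up as "broken" (compare timeout-broken: true).
            return -1;
        }
    }

    /** Tries HEAD first and falls back to GET when the HEAD result is not a working status. */
    static boolean check(URI uri, Set<Integer> additionalWorkingStatusCodes) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(CONNECT_TIMEOUT)
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        if (isWorking(status(client, "HEAD", uri), additionalWorkingStatusCodes)) {
            return true;
        }
        return isWorking(status(client, "GET", uri), additionalWorkingStatusCodes);
    }

    public static void main(String[] args) throws InterruptedException {
        Set<Integer> allowed = Set.of(500, 503);
        System.out.println(isWorking(503, allowed)); // allowed explicitly
        System.out.println(isWorking(404, allowed)); // neither 2xx/3xx nor allowed
        if (args.length > 0) {
            // Only performs a network request when a URL is passed on the command line.
            System.out.println(check(URI.create(args[0]), allowed));
        }
    }
}
```

The fallback to GET matters because some servers reject or mishandle HEAD requests even though the page itself is reachable.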
Adding to a Proposal Section
If link-checker.proposal-sections is set, broken links are added to the proposal section that is mapped to the closest structure path according to the link document's structure node. If a broken link is found to be working again, it is removed from the proposal section.
The named proposal sections must exist before Linkchecker is started.
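One plausible reading of "closest structure path" is a longest-prefix match: among all mappings whose structure path equals the document's structure path or is one of its ancestors, the longest one wins, and the mapping without a structure path serves as the fallback. The following Java sketch illustrates that reading only. The class ProposalSectionSketch and its resolve method are hypothetical, and Linkchecker's real matching rules may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Linkchecker.
final class ProposalSectionSketch {

    /**
     * Returns the section mapped to the longest structure path that equals the
     * document's path or is an ancestor of it; otherwise the default section.
     */
    static String resolve(Map<String, String> sectionsByPath, String defaultSection, String documentPath) {
        String bestPath = null;
        for (String path : sectionsByPath.keySet()) {
            // Require a full path segment match so that /demosite does not match /demositeX.
            boolean matches = documentPath.equals(path) || documentPath.startsWith(path + "/");
            if (matches && (bestPath == null || path.length() > bestPath.length())) {
                bestPath = path;
            }
        }
        return bestPath == null ? defaultSection : sectionsByPath.get(bestPath);
    }

    public static void main(String[] args) {
        // Mirrors the example configuration: one default mapping and one for /demosite.
        Map<String, String> sections = new LinkedHashMap<>();
        sections.put("/demosite", "Broken Links (Demo Site)");
        String fallback = "Broken Links";

        System.out.println(resolve(sections, fallback, "/demosite/home/news")); // Broken Links (Demo Site)
        System.out.println(resolve(sections, fallback, "/othersite/home"));     // Broken Links
    }
}
```

With the example configuration, a broken link located under /demosite/home/news would land in "Broken Links (Demo Site)", while a link under any other site would fall back to "Broken Links".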
Time Scheduling with Cron Configurations
The cron expression in the link-checker.cron parameter determines when Linkchecker inspects working and broken links.
Please consult the Spring documentation for details on the syntax of these expressions.
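As a quick orientation, Spring cron expressions have six space-separated fields (second, minute, hour, day of month, month, day of week), and macros such as "@midnight" are shorthand for a six-field expression. The Java sketch below merely labels the fields and expands the macros listed in the Spring documentation. It does not validate or evaluate expressions, and the class CronFieldsSketch is hypothetical, not part of Spring or Linkchecker.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; it labels fields but does not evaluate schedules.
final class CronFieldsSketch {

    // Field order of a Spring cron expression, left to right.
    static final String[] FIELD_NAMES = {"second", "minute", "hour", "day of month", "month", "day of week"};

    /** Expands the macros documented by Spring; other values are returned trimmed but otherwise unchanged. */
    static String expandMacro(String expression) {
        switch (expression.trim()) {
            case "@yearly":
            case "@annually":
                return "0 0 0 1 1 *";
            case "@monthly":
                return "0 0 0 1 * *";
            case "@weekly":
                return "0 0 0 * * 0";
            case "@daily":
            case "@midnight":
                return "0 0 0 * * *";
            case "@hourly":
                return "0 0 * * * *";
            default:
                return expression.trim();
        }
    }

    /** Labels each field of the expression; rejects anything that does not have six fields. */
    static Map<String, String> describe(String expression) {
        String[] fields = expandMacro(expression).split("\\s+");
        if (fields.length != FIELD_NAMES.length) {
            throw new IllegalArgumentException("Expected 6 fields but got " + fields.length + ": " + expression);
        }
        Map<String, String> labeled = new LinkedHashMap<>();
        for (int i = 0; i < fields.length; i++) {
            labeled.put(FIELD_NAMES[i], fields[i]);
        }
        return labeled;
    }

    public static void main(String[] args) {
        System.out.println(describe("@midnight"));
        System.out.println(describe("0 30 2 * * MON-FRI"));
    }
}
```

For example, setting link-checker.cron to "0 30 2 * * MON-FRI" would schedule the check for 02:30 on weekdays instead of the default run at midnight every day.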