Linkchecker 4

Administration

Learn how to install, configure and run Linkchecker.

Overview

Linkchecker is a Spring Boot Java app with an embedded Tomcat. It primarily offers a REST interface at http://localhost:8080/api/...

Requirements

Linkchecker requires Java 11; other versions, or the use of an embedded Java, are available upon request. It also requires Sophora 4 or higher; other versions are available upon request.

Installing and Running

Linkchecker consists of a single JAR file. To start Linkchecker, run the following command, replacing JARFILE with the path to that file:

java -jar JARFILE

The configuration file application.yaml must be placed in the current working directory.

Using Docker

To run Linkchecker as a Docker container, use the following command:

docker run \
  -p 8080:8080 \
  --mount "type=bind,source=$(pwd)/application.yaml,target=/application.yaml" \
  docker.subshell.com/sophora/linkchecker

Configuration

The following is an example application.yaml file.

Please note that the configuration options for the connection to the Sophora Server are listed in the Spring Boot Sophora Commons documentation.

# Settings for Sophora server connection.
sophora:
  client:
    server-connection:
      urls: http://sophora.example.com:1196
      username: user
      password: pass

link-checker:
  # List of link document types to check. Each link document type must have a 'node-type' and a 'property' field.
  documents:
    - node-type: sophora-extension-nt:link
      property: sophora-extension:url

  # Cron expression determining when to run link checks.
  #
  # Default: "@midnight"
  #
  # cron: "@midnight"

  # URL used to check internet connectivity. This URL is checked only every 10 minutes.
  internet-check-uri:

  # Page size to use when searching for link documents.
  #
  # Default: 500
  #
  # find-documents-page-size: 500

  # Set documents offline if a link is broken?
  #
  # Default: true
  #
  # broken-set-offline: true

  # Text to use to set the internal comment on documents if a link is broken.
  #
  # Default: Link nicht erreichbar
  #
  # broken-comment: Link nicht erreichbar

  # Treat link timeouts as broken?
  #
  # Default: true
  #
  # timeout-broken: true

  http:
    # Timeout for HTTP connections (msec).
    #
    # Default: 5000
    #
    # connect-timeout: 5000

    # Timeout for reading HTTP responses (msec).
    #
    # Default: 10000
    #
    # read-timeout: 10000

    # Try HTTP HEAD requests first? If the HEAD request fails, a GET request is made.
    #
    # Default: true
    #
    # try-head-first: true

    # List of HTTP status codes that should be treated as working instead of broken.
    #
    # Default: (empty list)
    #
    # additional-working-status-codes:
    #   - 500 # Internal Server Error
    #   - 503 # Service Unavailable

  # List of mappings from structure paths to proposal section names.
  # One of the mappings may omit the structure path. This mapping will be used as the default.
  #
  # Default:
  #   - proposal-section: Broken Links
  #
  # proposal-sections:
  #   - proposal-section: Broken Links
  #   - structure-path: /demosite
  #     proposal-section: Broken Links (Demo Site)

Adding to a Proposal Section

If link-checker.proposal-sections is set, broken links are added to the proposal section that is mapped to the closest structure path according to the link document's structure node. If a broken link is found to be working again, it is removed from the proposal section.

The named proposal sections must exist before Linkchecker is started.
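As an illustration, the following fragment sketches a configuration with one default mapping and two nested structure paths. The path /demosite/news and the section name "Broken Links (Demo Site News)" are hypothetical and used only for this example. The comments assume that "closest" means the most specific configured path that contains the document's structure node.

```yaml
link-checker:
  proposal-sections:
    # Default mapping: no structure path, used when no other mapping applies.
    - proposal-section: Broken Links
    # Assumed to apply to link documents anywhere below /demosite ...
    - structure-path: /demosite
      proposal-section: Broken Links (Demo Site)
    # ... unless a more specific mapping (hypothetical path) is closer.
    - structure-path: /demosite/news
      proposal-section: Broken Links (Demo Site News)
```

Under that assumption, a broken link document located below /demosite/news would be added to "Broken Links (Demo Site News)", one located elsewhere below /demosite to "Broken Links (Demo Site)", and one outside /demosite to the default section "Broken Links".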

Time Scheduling with Cron Configurations

The cron expression in the parameter link-checker.cron determines when Linkchecker checks links, both working and broken ones.

Please consult the Spring documentation for details on the syntax of these expressions.
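As a rough sketch, and assuming the value is interpreted as a standard Spring cron expression (six space-separated fields: second, minute, hour, day of month, month, day of week), a schedule other than the default could look like this. The chosen times are arbitrary examples.

```yaml
link-checker:
  # Run the link check every day at 03:00:00.
  cron: "0 0 3 * * *"

  # Alternative: run at 06:30:00 on weekdays only.
  # cron: "0 30 6 * * MON-FRI"
```

The value is best kept in quotes, because characters such as * and @ have a special meaning in YAML.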

Last modified on 10/16/20

The content of this page is licensed under the CC BY 4.0 License. Code samples are licensed under the MIT License.
