The Sophora Delivery coordinates the delivery of the managed content and provides access to it. The following figure displays the processes relevant for the delivery:
The Apache HTTP Server receives the HTTP requests for Sophora content. It first checks the cache for pages that have already been generated. If no matching fragment is found, the request is forwarded to the Tomcat containing the delivery web application. This is usually done using mod_proxy. The generated content is returned to the browser and stored in the cache store, so that the Apache can serve it directly on the next request.
There are two ways to generate content in the delivery. Within a website these content generation techniques can be combined:
- Pro-active Pre-Generation of cached content in the docroot of the Apache HTTP Server
- On-demand generation of cached content in the docroot of the Apache HTTP Server
By pre-generating HTML and image files, the delivery's performance can be enhanced significantly. The cache entries are generated in the background, so that a dynamic website can be delivered with performance comparable to that of a static site.
For content that isn't requested very often, like a site's archive, on-demand generation might be sufficient because not all content needs to be kept in the cache. With on-demand generation, the corresponding delivery formats are created as soon as they are requested by a user. Once created, those files are written to the web server's docroot so that later requests don't require another on-demand generation.
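The on-demand behaviour described above can be sketched in a few lines. This is a minimal illustration, not the delivery's actual implementation: the function name `serve`, the `render` callback (standing in for the delivery web application) and the mapping of URL paths to files below the docroot are assumptions made for the example.

```python
from pathlib import Path
from typing import Callable, Tuple


def serve(docroot: Path, url_path: str,
          render: Callable[[str], bytes]) -> Tuple[bytes, bool]:
    """Return (content, generated) for url_path.

    `render` is a hypothetical stand-in for the delivery web application.
    It is only called when the docroot holds no file for the path.
    """
    cached = docroot / url_path.lstrip("/")
    if cached.is_file():
        # Cache hit: the web server can answer without the delivery.
        return cached.read_bytes(), False
    # Cache miss: generate on demand and write the result to the docroot
    # so that later requests are served from the file system.
    content = render(url_path)
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(content)
    return content, True
```

Calling `serve` twice with the same path invokes `render` only on the first call; the second call reads the file written to the docroot. A real setup would also have to guard against path traversal and invalidate files when documents change, which this sketch leaves out.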
The delivery of content starts with a request, usually an HTTP request, and ends in a rendered resource like CSS stylesheets, scripts or a JSON data structure. The following figure shows the components involved in the Sophora lifecycle and describes their tasks in this process.
The HTTP request is sent to a web server. In a live environment, an Apache HTTP Server is usually used for performance reasons (a Tomcat servlet engine provides the same functionality and is sufficient in a development environment). The HTTP server checks the cache store for an already generated version of the requested content. If a matching fragment is found, it is returned. Otherwise the request is forwarded to the servlet engine containing the portal application (delivery). This is usually done via mod_proxy over HTTP.
Before any template is called, the request passes the sophora-filter-chain. The filters interpret the request, initialize and provide some business objects like the cache-facade or the contentProvider, load the corresponding Sophora content and find the template that should be used to render the document.
The delivery distinguishes between a request for a resource and a request for Sophora content. A resource is a file that is present in the web application, like static scripts. If such a resource is found in the app-base of the web application, the resource is returned. If a jsp with the corresponding name is found, the result of this jsp is returned.
If no matching resource is found, the request-URL is interpreted according to the Sophora URL pattern. Depending on the node-type, the request is forwarded to the template that should be used to render the current content. The mapping between the node-type and the template is defined in the templates.xml. The result of the template, e.g. a rendered HTML page, is stored as a fragment in the store using the cache facade. The dependencies between the request URI, the mapped template and the resources required to render the fragment are stored in the
The generated fragment may include SSI-commands that lead to subrequests in which the SSI-commands are resolved. The fragments are assembled into a page and sent back in the HTTP response. In a live environment this is done by the Apache HTTP Server using mod_include. In a development environment this is usually done by the servlet engine via the
The website's content is delivered using descriptive URLs.
A document's URL consists of its path (derived from the location in the structure tree) and the Sophora ID (the ID stem together with a counter). Keep in mind that in some cases additional parameters are required.
Sophora URLs are used to request Sophora content through the delivery application, usually over HTTP. The URLs contain several pieces of information that are read by the Sophora filter chain and translated into implicit objects that can be used in the templates. For example, the URL http://demo.sophoracms.com/trendcities/copenhagen/cphvision/cphimpressions100~small_v-teaser.jpg is split into the following parts:
- demo.sophoracms.com: the site where the requested document should be rendered
- /trendcities/copenhagen/cphvision/: the complete structure path under which the template should be rendered (this path may differ from the path under which the document is located in the repository)
- cphimpressions100: the Sophora ID of the requested document
- ~small: the template type that will be used. The template type can be defined in the templates.xml.
- _v-teaser: additional parameters that are needed to render the template. In this case a parameter with the name 'v' and the value 'teaser' will be accessible in the templates.
- .jpg: the suffix (needed to determine the mime-type of the requested content)
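To make the decomposition above concrete, the following sketch splits such a URL into the listed parts. It is an illustration only and not the delivery's URL codec: the function name `parse_sophora_url`, the regular expression, the assumption that Sophora IDs contain none of the characters `~`, `_` or `.`, and the assumption that several parameters would be separated by `_` are all made up for this example.

```python
import re
from urllib.parse import urlsplit

# Last path segment: <sophoraId>[~<templateType>][_<params>].<suffix>
# Assumption: the Sophora ID itself contains no '~', '_' or '.'.
_SEGMENT = re.compile(
    r"^(?P<sophora_id>[^~_.]+)"
    r"(?:~(?P<template_type>[^_.]*))?"
    r"(?:_(?P<params>[^.]+))?"
    r"\.(?P<suffix>[^.]+)$"
)


def parse_sophora_url(url: str) -> dict:
    """Split a mounted Sophora URL into the parts described in the text."""
    parts = urlsplit(url)
    structure_path, _, segment = parts.path.rpartition("/")
    match = _SEGMENT.match(segment)
    if match is None:
        raise ValueError(f"not a Sophora document URL: {url!r}")
    params = {}
    # Assumption: parameters look like name-value and are joined by '_'.
    for pair in (match["params"] or "").split("_"):
        if pair:
            name, _, value = pair.partition("-")
            params[name] = value
    return {
        "site": parts.netloc,
        "structure_path": structure_path + "/",
        "sophora_id": match["sophora_id"],
        "template_type": match["template_type"] or None,
        "params": params,
        "suffix": match["suffix"],
    }
```

For the example URL, the function returns the site `demo.sophoracms.com`, the structure path `/trendcities/copenhagen/cphvision/`, the Sophora ID `cphimpressions100`, the template type `small`, the parameters `{"v": "teaser"}` and the suffix `jpg`.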
The URL described above addresses a mounted system. This is the typical setup in a live environment. The request addresses an Apache HTTP Server that forwards it to the Tomcat where the web application is deployed. In the process, the URL is rewritten: the domain or hostname is replaced with the hostname where the Tomcat is installed, and the web application's context path and the site name are added to the URL. The rewritten URL might then look like:
- tomcat.demo.sophoracms.com:8080 is the hostname and port under which Tomcat is accessible
- sophora-demosite-live is the name of the web application context
- demosite is the name of the requested site, matching the name of the site in the repository
Sophora provides one codec to create and interpret URLs: the DefaultSophoraCodec. If this codec does not meet your needs, you can create your own URL codec. See section templates.xml for details.
The DefaultSophoraCodec allows digits in the ID stem. The SophoraId and the templateType are separated by ~. URLs that are generated with the defaultUrlCodec have the following structure:
- not mounted:
To add a validation mechanism for URL parameters, the property sophora.delivery.urlCodec.validateUrlParams can be set to true. If this property is set, the Sophora Taglib checks whether there are any parameters when creating a URL. If so, the delivery creates a checksum of all URL parameters and appends it to the URL. For example, a URL like /node/file100~parameter-1.html results in /node/file100~_parameter-1-9f727126373e44a2fd0ed552d62d7f6d296430f3.html.
If a request is sent to the Tomcat server and the property sophora.delivery.urlCodec.validateUrlParams is true, the delivery checks whether the checksum matches the specified URL parameters. If the check fails, a 404 error is returned. If the created URL has no URL parameters, no checksum is created.
This validation mechanism provides some additional security, because URLs cannot be changed manually and new parameters cannot be added to a URL without knowing the correct checksum.
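The general idea of such a parameter checksum can be sketched as follows. The text does not say how the delivery computes its checksum, so everything specific here is an assumption: the sketch uses HMAC-SHA1 with a hypothetical secret key, and the function names `sign`, `validate` and `status_for` are invented for the example.

```python
import hashlib
import hmac

# Hypothetical secret; the real delivery's checksum input is not documented here.
SECRET = b"change-me"


def checksum(params: str) -> str:
    """Keyed checksum over the parameter string (assumed: HMAC-SHA1)."""
    return hmac.new(SECRET, params.encode("utf-8"), hashlib.sha1).hexdigest()


def sign(params: str) -> str:
    """Append the checksum to a parameter string such as 'parameter-1'.

    Without parameters, no checksum is created.
    """
    return f"{params}-{checksum(params)}" if params else params


def validate(signed: str) -> bool:
    """Return True if the trailing checksum matches the parameters."""
    params, sep, received = signed.rpartition("-")
    if not sep or not params:
        return False
    expected = checksum(params)
    # Constant-time comparison avoids leaking how many characters match.
    return hmac.compare_digest(expected.encode(), received.encode())


def status_for(signed: str) -> int:
    """Map the validation result to the HTTP status described in the text."""
    return 200 if validate(signed) else 404
```

With this sketch, `validate(sign("parameter-1"))` is true, while changing the parameter value or the checksum by hand makes `status_for` return 404.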
Calling a URL that contains a Sophora ID may result in a "404 - Document not Found" error if no document can be found for the requested ID. In this case the delivery saves the ID in order to avoid further requests to the Sophora server for this particular ID. Up to 1000 such Sophora IDs are saved (unless configured otherwise).
If this happens, you can see the following entry in the
Adding sophoraId 'test102' to ignoreList (current size: 1)
Or if the ID is removed again:
Removing sophoraId 'test102' from ignoreList (current size: 0)
Document changes update the ignore list, so that, for example, Sophora IDs of newly created documents are removed from it.
JMX can be used to check the size of the ignore list and the Sophora IDs it contains at a given point in time (see MBean SophoraIdIgnoreList). In addition, it is possible to remove individual or all Sophora IDs and to set the cacheSize via JMX.
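The ignore list behaves like a bounded negative cache. The sketch below illustrates the idea under stated assumptions: the class `IgnoreList` and the function `lookup` are hypothetical names, and the eviction of the oldest entry once the limit is reached is an assumption, since the text does not say what happens when the list is full. The log messages follow the format shown above.

```python
import logging
from collections import OrderedDict

log = logging.getLogger("ignore_list")


class IgnoreList:
    """Bounded set of Sophora IDs for which no document was found."""

    def __init__(self, cache_size: int = 1000):
        self.cache_size = cache_size
        self._ids = OrderedDict()

    def add(self, sophora_id: str) -> None:
        if sophora_id in self._ids:
            return
        if len(self._ids) >= self.cache_size:
            # Assumption: drop the oldest entry when the list is full.
            self._ids.popitem(last=False)
        self._ids[sophora_id] = None
        log.info("Adding sophoraId '%s' to ignoreList (current size: %d)",
                 sophora_id, len(self._ids))

    def remove(self, sophora_id: str) -> None:
        if sophora_id in self._ids:
            del self._ids[sophora_id]
            log.info("Removing sophoraId '%s' from ignoreList (current size: %d)",
                     sophora_id, len(self._ids))

    def __contains__(self, sophora_id: str) -> bool:
        return sophora_id in self._ids

    def __len__(self) -> int:
        return len(self._ids)


def lookup(sophora_id: str, ignore_list: IgnoreList, fetch):
    """Return the document, or None (a 404) without asking the server twice.

    `fetch` is a hypothetical stand-in for the request to the Sophora server.
    """
    if sophora_id in ignore_list:
        return None
    document = fetch(sophora_id)
    if document is None:
        ignore_list.add(sophora_id)
    return document
```

A document change handler would call `remove` for the affected ID, so that a newly created document becomes reachable again.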
Each request is filtered by several servlet filters within the web application. Depending on the configuration, filters may be omitted or disabled. The Sophora Delivery provides a set of filters that are required in order to render Sophora content. Additionally, some optional filters can be used. You can find a description of these filters in the section Additional (optional) Filters.
More information about the configuration of the filters can be found in the section Filters.
This filter is active if the Tomcat is not running behind a web server. The Tomcat itself needs to be configured accordingly (see Configuration Parameters). This filter triggers SSI statements using an instance of the HttpClient. By default the filter only reacts to responses whose content type starts with "text" or "application/xml". This behaviour can be altered by changing the property contentTypePrefixes of this filter in the web.xml file. Here, you can enter several content type prefixes, separated by commas.
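The two steps of this filter, deciding by content type and then resolving SSI statements through subrequests, can be sketched as follows. This is an illustration and not the filter's code: the function names are invented, `fetch` is a hypothetical stand-in for the subrequest made with the HttpClient, and the sketch handles only the `<!--#include virtual="..." -->` directive known from Apache's mod_include, with an arbitrary nesting limit.

```python
import re

# Defaults taken from the text; the names below are hypothetical.
DEFAULT_PREFIXES = ("text", "application/xml")

# Apache-style include directive: <!--#include virtual="/path" -->
_INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def parse_prefixes(value: str) -> tuple:
    """Split a comma-separated contentTypePrefixes value."""
    return tuple(p.strip() for p in value.split(",") if p.strip())


def should_process(content_type: str, prefixes=DEFAULT_PREFIXES) -> bool:
    """True if the response content type starts with a configured prefix."""
    return content_type.strip().lower().startswith(tuple(prefixes))


def resolve_ssi(body: str, fetch, depth: int = 5) -> str:
    """Replace each include directive with the fragment returned by fetch.

    Included fragments are resolved recursively up to `depth` levels.
    """
    if depth == 0:
        return body
    return _INCLUDE.sub(
        lambda m: resolve_ssi(fetch(m.group(1)), fetch, depth - 1), body
    )
```

A response with the content type `text/html; charset=UTF-8` would be processed, while `image/jpeg` would be passed through untouched.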
If caching is active, this filter checks whether the requested content has already been generated and is available in the cache store. In this case, the corresponding content is returned (and the remaining filters are omitted). If the content has not been generated, the remaining filters are applied. The parameter encoding defines the encoding of the response. If this parameter hasn't been set, Tomcat's default encoding is used. The encoding of the delivered HTML file is not changed by the cache filter, because the parameter only affects the header.
Generates the content provider and puts it into the request scope. The content provider can be accessed using the DeliveryUtils and can be used to access the server via the
Parses the URL and executes the shortcut mechanism. This filter also sets the values of the currentStructureNodeUuid of the request. By default every URL that is handed to this filter is parsed. It is possible to exclude URLs with certain suffixes from being interpreted. This is usually done for logical URL mappings, e.g. for servlets that have neither a corresponding resource in the context root of the web application nor a corresponding document in the repository.
Handles the upload of multipart form data, usually a file upload. Enables definition of a maximum file size for uploads. When using this filter, uploaded files are available through the request variable
Writes all exceptions that occur during processing of requests to the logfile.
This filter removes all empty lines from the output and trims all other lines. It is applied to all servlet responses with the mime type text.
Additionally, it is applied to all application mime types with the subtype xml or with subtypes beginning with xml+, xhtml+, rss+ or rdf+. You can alter this default behavior by setting the filter params applicationSubtypePrefixes. The specific types must be separated by commas, for example
By default the filter will not trim lines between <pre ...> and </pre> or between <textarea ...> and </textarea>. This behaviour can be overridden by setting the param ignoredTags and passing all tags to be ignored, separated by commas (default:
For detailed information about configuration of the filters, please refer to the configuration section.
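As an illustration of the line-trimming filter described above, the following sketch removes empty lines and trims all other lines while leaving the content of ignored tags untouched. It is a simplified, line-based approximation and not the filter's implementation: the function name `trim_lines` is invented, and the default tag list `pre` and `textarea` is assumed from the tags named in the text.

```python
import re

# Assumed default, based on the tags named in the text.
IGNORED_TAGS = ("pre", "textarea")


def trim_lines(text: str, ignored_tags=IGNORED_TAGS) -> str:
    """Drop empty lines and trim the rest, except inside ignored tags."""
    names = "|".join(re.escape(tag) for tag in ignored_tags)
    open_re = re.compile(rf"<({names})\b", re.IGNORECASE) if names else None
    out = []
    inside = None  # name of the ignored tag we are currently inside, if any
    for line in text.splitlines():
        if inside is None:
            match = open_re.search(line) if open_re else None
            if match is None:
                stripped = line.strip()
                if stripped:
                    out.append(stripped)
                continue
            # An ignored tag opens on this line: protect it from here on.
            inside = match.group(1).lower()
            rest = line[match.end():]
        else:
            rest = line
        out.append(line)  # protected line, kept exactly as it is
        if f"</{inside}" in rest.lower():
            inside = None
    return "\n".join(out)
```

Because the sketch works on whole lines, a line that opens or closes an ignored tag is kept unchanged in its entirety; a real filter working on the response stream could be more precise.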
The Sophora Delivery performs many I/O operations on the hard disk. The cache queue and HTML fragments are stored in the file system, the dependencies of Sophora documents to the HTML fragments are stored in the cache database and are kept up-to-date, and so on. Therefore the underlying system requires fast I/O components, e.g. hard disks/RAID with write cache enabled, to prevent I/O from slowing down or even blocking the delivery.