You can create as many indexes in Solr as you need. To do that, create a new index configuration by selecting Administration view > Index Configurations > New: Index Configuration from the context menu. This will let you create a new system document for a new index.
Note that an index only becomes active in Solr if the index configuration document is published. To delete an index from the Solr instance, simply set the index configuration document offline.
The following lists the properties of the index configuration document:
|Name||The index name. This will also be used by the Solr server to store the index in the filesystem.|
|Only published versions||Only published versions will be stored in the index.|
|Deleted documents||Deleted documents will be stored in the index.|
|Available on server||Which servers the index will be available on. Note that the index will always be available on the Sophora Primary (Master) Server.|
|Nodetypes||Only documents having the specified nodetypes will be stored in the index.|
|Structurenodes||Only documents located in the specified structure nodes will be stored in the index.|
|Channels||Only documents valid for the selected channels will be stored in the index. Will store all documents if no channel is selected.|
|Filter Script||Only documents matching this Groovy filter script will be stored in the index. Will store all documents if no script is specified.|
The script must return true/false indicating whether the current document should be stored in the index, for example:
// this will filter out all documents not having the "myProp" property
The following predefined variables may be used in the script:
- document (INode) - the document in question
- contentManager (IContentManager) - a content manager that can be used to get more information from the server
- sessionToken (SessionToken) - a session token to be used with content manager calls
|Reindex Search Query||Only documents matching this query will be reindexed. This prevents the index from being reindexed completely when the index configuration is published again.|
The query is written in JCR format. For example, the following will only reindex a certain document type:
@jcr:primaryType = 'sophora-nt:myType'
Another use case would be only reindexing documents that have a certain value stored in a property. The following will only reindex documents having the "myProp" property with a value of "test":
@myProp = 'test'
Reindexing queries are commonly intended to be used just once by the next reindexing task. However, once set they will apply to all upcoming tasks if not removed. Since this might be unintentional, a warning will appear as a reminder that a reindexing query is still present and will affect upcoming reindexing tasks.
|Mapping Document||One or more custom index mappings to use, see below.|
|Remove after days||Documents older than this number of days will automatically be removed from the index.|
|Remove after days reference||Specifies the date properties that serve as reference for removing documents after a number of days. The order is important: The second property in the list is only evaluated when the first property is not set, and so on.|
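The interplay of "Remove after days" and "Remove after days reference" can be sketched as follows. This is only an illustration of the fallback order described above, written in plain Java with a `Map` standing in for a document's date properties. The class, the method names, the property names (`example:endDate`, `example:publishDate`) and the choice to keep documents that have no reference date are all assumptions, not Sophora's implementation.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; not part of the Sophora API.
final class RemoveAfterDaysSketch {

    // Returns the first reference date that is set, honouring the configured order:
    // the second property is only consulted when the first one is not set, and so on.
    static Optional<Instant> referenceDate(List<String> referenceProperties,
                                           Map<String, Instant> dateProperties) {
        for (String name : referenceProperties) {
            Instant value = dateProperties.get(name);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    // True when the reference date lies more than removeAfterDays days before 'now'.
    // Assumption: a document without any reference date is kept in the index.
    static boolean isDueForRemoval(List<String> referenceProperties,
                                   Map<String, Instant> dateProperties,
                                   int removeAfterDays,
                                   Instant now) {
        Instant threshold = now.minus(Duration.ofDays(removeAfterDays));
        return referenceDate(referenceProperties, dateProperties)
                .map(date -> date.isBefore(threshold))
                .orElse(false);
    }

    public static void main(String[] args) {
        List<String> order = List.of("example:endDate", "example:publishDate");
        Map<String, Instant> document =
                Map.of("example:publishDate", Instant.parse("2024-01-01T00:00:00Z"));
        Instant now = Instant.parse("2024-01-31T00:00:00Z");
        System.out.println(isDueForRemoval(order, document, 14, now));
    }
}
```

Because the list is evaluated in order, a document with both properties set is judged by the first one only, even if the second date is much older.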
When documents are stored in the Solr index, their properties must be converted into index fields. This is done by a "mapping." Sophora implements a default mapping for all property types and child nodes. Administrators may configure new fields to be added to index documents.
The Solr index may contain index documents not only for Sophora nodes but also for child nodes, if those child nodes are configured as rows of dynamic tables. To distinguish between the different index document types, each Solr index document has a solr_document_type_s field, which can have one of the following values:
- empty or set to node, in which case it is a normal Sophora node
- set to childNode, in which case it is the child node of a Sophora node
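As a small illustration of how a consumer of the index might tell the two kinds of index documents apart, the following sketch classifies the value of the solr_document_type_s field. The class and method names are hypothetical. The only rule taken from the text is that a missing or empty value counts as a normal node.

```java
// Hypothetical helper; not part of the Sophora API.
final class IndexDocumentTypeSketch {

    // A child node is marked explicitly with the value "childNode".
    static boolean isChildNode(String solrDocumentType) {
        return "childNode".equals(solrDocumentType);
    }

    // A normal node either carries the value "node" or has no value at all.
    static boolean isNode(String solrDocumentType) {
        return solrDocumentType == null
                || solrDocumentType.isEmpty()
                || "node".equals(solrDocumentType);
    }

    public static void main(String[] args) {
        System.out.println(isNode(null));
        System.out.println(isChildNode("childNode"));
    }
}
```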
The following lists the default mappings between data sources and their respective Solr index fields for Sophora documents. The solr_document_type_s field will be set to node or, in an upgrade scenario, not be present at all.
|Data Source||Solr Index Field|
|Document's UUID||Field "id"|
|Index document type||Field "solr_document_type_s" (the field is either not present or the value is set to "node")|
|Document's node type||Field "primaryType_s"|
|Document's structure node's path||Field "structureNode_path_s"|
|Document's structure node's alias||Field "structureNode_alias_s"|
|Document's structure node's default document's UUID||Field "structureNode_defaultDocumentUuid_s"|
|Document's structure node's hierarchy UUIDs||Field "structureNode_hierarchyUuids_ss"|
|Time/date of indexing the document||Field "indexedDate_dt"|
|String property||Field "NAME_s" for regular text fields, "NAME_t" for text fields having more than one row, or "NAME_txt" for multi-valued string properties.|
|Long property||Field "NAME_l" for regular long properties, or "NAME_ls" for multi-valued long properties.|
|Double property||Field "NAME_d" for regular double properties, or "NAME_ds" for multi-valued double properties.|
|Date property||Field "NAME_dt" for regular date properties, or "NAME_dts" for multi-valued date properties.|
|All other property types||Field "NAME_s" for regular properties, or "NAME_ss" for multi-valued properties. The property is converted to its string representation first.|
|Copytext child node||Field "copytext_t" containing complete plain text.|
|All components||Field "childnode_reference_uuid_ss" containing all UUIDs of referenced documents.|
|Properties of all components||Field "childnode_content_t" containing all the properties of all referenced documents. This does not include any properties in the "sophora"-Namespace, nor does it include any copytext. Relevant properties are converted to their string representations first.|
|Documents loaded from the server via getDocument*() calls||Field "included_uuid_ss" containing all UUIDs of documents that were loaded from the server via getDocument*() calls.|
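The suffix rules for property fields in the table above can be summarised in a few lines of code. The following is a sketch only: the enum, the method and its parameters are made up for illustration. How NAME is derived from the property name is not specified here, so the sketch takes it as given. It also assumes that multi-valued takes precedence over multi-row for string properties.

```java
// Hypothetical sketch of the default suffix rules; not part of the Sophora API.
final class DefaultFieldNameSketch {

    enum PropertyType { STRING, LONG, DOUBLE, DATE, OTHER }

    // Builds the Solr field name for a property from its type and multiplicity.
    // 'multiRowText' is only relevant for single-valued string properties.
    static String fieldName(String name, PropertyType type,
                            boolean multiValued, boolean multiRowText) {
        switch (type) {
            case STRING:
                if (multiValued) {
                    return name + "_txt";
                }
                return name + (multiRowText ? "_t" : "_s");
            case LONG:
                return name + (multiValued ? "_ls" : "_l");
            case DOUBLE:
                return name + (multiValued ? "_ds" : "_d");
            case DATE:
                return name + (multiValued ? "_dts" : "_dt");
            default:
                // All other types are indexed via their string representation.
                return name + (multiValued ? "_ss" : "_s");
        }
    }

    public static void main(String[] args) {
        System.out.println(fieldName("title", PropertyType.STRING, false, false));
        System.out.println(fieldName("ratings", PropertyType.LONG, true, false));
    }
}
```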
The following lists the default mappings between data sources and their respective Solr index fields for child nodes. The solr_document_type_s field will be set to childNode.
|Data Source||Solr Index Field|
|Unique index document id||Field "id"|
|Index document type||Field "solr_document_type_s" (the value is set to "childNode")|
|UUID of the parent document||Field "parentNode_uuid_s"|
|Primary type of the parent document||Field "parentNode_primaryType_s"|
|Structure hierarchy UUIDs of the parent document||Field "parentNode_structureNode_hierarchyUuids_ss"|
|Primary type of the child node||Field "childNode_primaryType_s"|
|Name of the child node in the parent node's configuration||Field "childNode_name_s"|
|Index of this child node in the list of child nodes with this child node name||Field "childNode_index_i"|
Furthermore, all properties of the child node are mapped in the same way as in the default mapping for sophora documents.
In the event that Sophora's default mapping is not sufficient, administrators may opt to add customized mappings. These may either add new fields or override default fields.
To add a new custom mapping, create a new index mapping by selecting Administration view > Index Mappings > New: Index Mapping from the context menu. This will let you create a new system document for a new index mapping.
The following lists the properties of the index mapping document:
|Name||The index mapping name.|
|Indexfields||One or more custom index fields that are active when this mapping is used in an index configuration, see below.|
|Channel affiliations||The Solr fields channel_names_ss and channel_uuids_ss contain the channels that are enabled for the document, provided the channel is enabled here. Unlike sophora_enabledChannels_ss, these fields include information that is inherited from structure nodes.|
A built-in index named "default" contains data from all documents in the repository. It is used, for example, for DeskClient searches. A second built-in index is named "default-live". It contains all documents in their latest live version. These built-in indices cannot be configured using regular index configurations. However, admins can add custom index fields to these indices using the "Default-Core-Mapping" index mapping document, which can be found at Administration view > Solr. For instance, custom index fields within the default index can be used for custom search result orders (see Search Result Order Options in the Work Environment Configuration).
If a sophora document includes one or more dynamic tables, each row of the dynamic table will be represented by a separate Solr index document.
To actually create new fields (or override existing ones) in a Solr index document, there needs to be a specification of how data is converted to fill the new field. To create a new index field, select Administration view > Index Fields > New: Index Field from the context menu. This will let you create a new system document for a new index field.
The following lists the properties of the index field document:
|Label||The field's label.|
|Fieldname||The field's name. This name is used to store data into the Solr index.|
|Script||A Groovy script that returns data to be stored into the index field.|
The script must return an Object that represents the data to store into the field:
- null (the field will not be indexed)
- Classes corresponding to primitive types, e.g. Integer, Double, Boolean
- Collections of the above types (for multivalued fields)
The following predefined variables may be used in the script:
|document||INode||the document in question|
|nodeType||NodeType||the document's node type|
|contentManager||IContentManager||a content manager that can be used to get more information from the server|
|sessionToken||SessionToken||a session token to be used with content manager calls|
The following functions may also be called from the script:
|String stripRichText(String text)||Strips HTML/XML tags and some special characters from richtext fields.|
|IContent getNearestHierarchyDocument(IContent document)||Searches the parent structure nodes of the document for the first hierarchy document.|
|List derefChildNodes(IContent parent, String childNodeName)||Dereferences all child nodes with the given name. The method first reads a UUID from the property "sophora:reference" in each child node and then loads the referenced document. Child nodes without a "sophora:reference" property and external references are silently ignored. The returned documents don't contain binary data properties.|
|String getPropertyStringValueFromStructureNodeDocument(IContent document, String propertyName)||Searches all structure node documents in the structure node hierarchy of the document for the given string property. If no property is found, 'null' is returned.|
|List getSelectValueLabels(IContent document, String propertyName)||Returns the labels of the select value of the given property. If the property is single-valued, the returned list will contain at most one element. If the property is multi-valued, the list will contain at most one label for each property value. If no label is found for a property value, none is returned.|
|List getSelectValueLabels(IContent document, NodeType nodeType, String propertyName)||Returns the labels of the select value of the given property for the given node type. If the property is single-valued, the returned list will contain at most one element. If the property is multi-valued, the list will contain at most one label for each property value. If no label is found for a property value, none is returned.|
|String getDocumentUrl(UUID uuid)||Returns the delivery-side URL to a document.|
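To illustrate the return contract of an index field script (a value to index, or null to leave the field out), here is a minimal sketch. It is plain Java rather than a Groovy script, and the `Map` of properties is a stand-in for the `document` variable, not the Sophora API. The `species` property and the reversal logic are assumptions, chosen so that the result would suit a field named like `species_reversed_s`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a custom index field script; not the Sophora API.
final class ReversedFieldSketch {

    // Returns the value to store in the index field, or null so that
    // the field is not indexed for documents without the property.
    static Object fieldValue(Map<String, String> properties) {
        String species = properties.get("species");
        if (species == null) {
            return null;
        }
        return new StringBuilder(species).reverse().toString();
    }

    public static void main(String[] args) {
        Map<String, String> properties = new HashMap<>();
        properties.put("species", "Dog");
        System.out.println(fieldValue(properties)); // prints "goD"
    }
}
```

In a real script, the value would be read from the predefined `document` variable instead of a map.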
Whenever a document that should be stored in the index (according to the index configuration) is modified, it is kept in sync with the index: any change to the document triggers an automatic reindex so that the index stays up to date. If a document is deleted, it is automatically removed from the index unless "Deleted documents" is checked in the index configuration.
When using custom index fields, the respective Groovy scripts may call getDocument*() methods on the content manager. Whenever one of the documents returned by these methods changes, the document using the custom index field will be reindexed.
Whenever a structure node is changed, all documents located in that structure node (and its children) will be reindexed automatically.
Whenever an index configuration is published, the complete index will be rebuilt. In this case, rebuilding will take place in a temporary index so that the current index can still be used for searches. Once rebuilding is done, the temporary index becomes the new index, and the old index is deleted. Note that publishing an index mapping document or an index field document will not trigger a rebuild of the associated indexes.
When a channel, index mapping or index field is changed and published, the index configurations that have to be republished are marked red in the administration view. Furthermore, a sticky note is created. The color of the sticky note is red by default. To change the color, add an entry with the key sophora.configuration.republishIndexConfigurationColor and, for example, the value 255,255,0 in the configuration document.
Solr block joins offer better performance than regular joins. As of Sophora Server releases 2.5.35, 3.0.1 and 4, documents are indexed in a way that allows block join queries. Once a core has been fully reindexed, block joins can be enabled for it by setting the key blockjoin.enabled.<core> to true in the configuration document.
After that, block joins will be used for the EPG view in the DeskClient (epg-model since 2.5.13, 3.0.1 and 4), in the epg-taglib (since sophora-delivery 2.5.37, 3.0.1 and 4) and in the server (since 2.5.35).
To search for documents in your own index (Solr core), you have to specify the core in the search parameters. This is done with the class SolrSearchParameters, which lets you configure the Solr query. Use setCore(corename) to specify the index for the search. Check the JavaDoc of that class for more information.
See the following code snippet:
// You can create any query, like NodeTypeQuery or PropertyQuery
// (except for XPathQuery, which cannot query Solr).
// With a SolrQuery you can query your own index fields.
IQuery query = new SolrQuery("species_reversed_s: \"goD\"")

// Use the special SolrSearchParameters instead of the common SearchParameters.
SolrSearchParameters parameters = new SolrSearchParameters()
parameters.setPageSize(10)
// Set the index/core to search in.
parameters.setCore("Animalcore")

// Execute the search.
ISophoraClient client = ...
UuidSearchResult searchResult = client.findDocumentUuids(query, parameters)

// Do something with the result.
List<UUID> uuids = searchResult.getUUIDs()
...
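When the value in a query string such as the one passed to SolrQuery above comes from user input, quotes and backslashes inside the phrase need escaping. The following helper is a hypothetical sketch, not part of the Sophora client API. It only builds the query string and relies on the Solr standard query parser's rule that a backslash escapes special characters.

```java
// Hypothetical helper for building a quoted field query; not part of the Sophora API.
final class SolrPhraseQuerySketch {

    // Builds field:"value", escaping backslashes first and then double quotes
    // so that the value cannot terminate the phrase early.
    static String phraseQuery(String field, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        System.out.println(phraseQuery("species_reversed_s", "goD"));
    }
}
```

The resulting string could then be passed to a query object in place of the hand-written literal.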