Indexing
Holds settings regarding the indexing, e.g. of TYPO3 records, to search services.
Configured as:
plugin {
    tx_searchcore {
        settings {
            indexing {
                identifier {
                    indexer = FullyQualifiedClassname
                    // the settings
                }
            }
        }
    }
}
Where identifier is up to you, but should match table names to make TcaIndexer work.
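As a minimal sketch, the following uses the table name tt_content as identifier. The class name shown for the indexer is an assumption and depends on the installed extension version; replace it with the fully qualified class name of the indexer you actually use.

```typoscript
plugin.tx_searchcore.settings.indexing {
    # The identifier matches the table name, as TcaIndexer expects
    tt_content {
        # Assumed class name, verify it against your installation
        indexer = Codappix\SearchCore\Domain\Index\TcaIndexer
    }
}
```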
The following settings are available. For each setting it is documented which indexer consumes it.
rootLineBlacklist
Used by: TcaIndexer, PagesIndexer.
Defines a blacklist of page uids. Records on any of these pages, or on their subpages, are not indexed. This allows you to define areas that should not be indexed. The page attribute No Search is also taken into account; it prevents indexing records from that single page only, without recursion.
Contains a comma separated list of page uids. Spaces are trimmed.
Example:
plugin.tx_searchcore.settings.indexing.pages.rootLineBlacklist = 3, 10, 100
additionalWhereClause
Used by: TcaIndexer, PagesIndexer.
Adds additional SQL to the where clause that determines indexable records from the table. This way you can exclude specific records, for example tt_content records with specific CType values.
Example:
plugin.tx_searchcore.settings.indexing.tt_content.additionalWhereClause = tt_content.CType NOT IN ('gridelements_pi1', 'list', 'div', 'menu')
Attention
Make sure to prefix all fields with the corresponding table name. The selection from database might contain joins and can lead to SQL errors if a field exists in multiple tables.
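The same rule applies to other tables. As a sketch, the following restricts which pages are indexed by their doktype, with the field prefixed by its table name. The chosen values (3 = external link, 199 = spacer, 254 = folder) are only an illustration, not a recommendation from this documentation.

```typoscript
# Illustrative only: skip pages that usually carry no content of their own
plugin.tx_searchcore.settings.indexing.pages.additionalWhereClause = pages.doktype NOT IN (3, 199, 254)
```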
abstractFields
Used by: PagesIndexer.
Note
Will be migrated to dataprocessors in the future.
Defines which fields are used to provide the auto generated field "search_abstract". The fields have to exist in the record to be indexed, so auto generated fields like content are also possible.
Example:
# As last fallback we use the content of the page
plugin.tx_searchcore.settings.indexing.pages.abstractFields := addToList(content)
Default:
abstract, description, bodytext
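Instead of appending to the defaults, the list can presumably also be set as a whole, like any comma separated TypoScript value. The sketch below assumes that the fields are tried in the listed order, which the "last fallback" comment in the example suggests but which is not stated explicitly here.

```typoscript
# Assumption: the first listed field that contains a value is used
plugin.tx_searchcore.settings.indexing.pages.abstractFields = description, abstract, content
```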
contentFields
Used by: PagesIndexer.
Define which fields should be used to provide the auto generated field "content".
Example:
plugin.tx_searchcore.settings.indexing.pages.contentFields := addToList(table_caption)
Default:
header, bodytext
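A sketch of replacing the defaults instead of extending them. Here subheader serves only as an illustrative tt_content column and is not part of the documented defaults.

```typoscript
# Overrides the default list (header, bodytext) completely
plugin.tx_searchcore.settings.indexing.pages.contentFields = header, subheader, bodytext
```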
mapping
Used by: connection_elasticsearch connection while indexing.
Defines the mapping for Elasticsearch. See the official docs: https://www.elastic.co/guide/en/elasticsearch/reference/5.2/mapping.html
You can define the mapping for each property / column.
Example:
plugin.tx_searchcore.settings.indexing.tt_content.mapping {
    CType {
        type = keyword
    }
}
The above example defines the CType field of tt_content as type: keyword. This makes building a facet possible.
index
Used by: connection_elasticsearch connection while indexing.
Defines the index settings for Elasticsearch. See the official docs: https://www.elastic.co/guide/en/elasticsearch/reference/5.2/indices-create-index.html
Example:
plugin.tx_searchcore.settings.indexing.tt_content.index {
    analysis {
        analyzer {
            ngram4 {
                type = custom
                tokenizer = ngram4
                char_filter = html_strip
                filter = lowercase, asciifolding
            }
        }
        tokenizer {
            ngram4 {
                type = ngram
                min_gram = 4
                max_gram = 4
            }
        }
    }
}
char_filter and filter are comma separated lists of options.
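A custom analyzer is typically referenced from a field mapping. Assuming the mapping setting described earlier passes its properties through to Elasticsearch unchanged, a field could refer to the ngram4 analyzer as sketched below; header is just an example column.

```typoscript
plugin.tx_searchcore.settings.indexing.tt_content.mapping {
    # Example column, pick the field you want to analyze with ngrams
    header {
        type = text
        # Refers to the analyzer defined in the index settings
        analyzer = ngram4
    }
}
```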
dataProcessing
Used by: All connections while indexing, due to implementation inside AbstractIndexer.
Configures modifications on each document before sending it to the configured connection. For full documentation check out dataprocessors.
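A rough sketch of what such a configuration might look like. Both the processor class name and its to option are assumptions made for illustration; take the processors that actually exist and their options from the dataprocessors documentation.

```typoscript
plugin.tx_searchcore.settings.indexing.tt_content.dataProcessing {
    # Assumed processor class and option, shown for illustration only
    1 = Codappix\SearchCore\DataProcessing\CopyToProcessor
    1 {
        # Hypothetical target field that receives the copied values
        to = search_all
    }
}
```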