Mapping
Overview
After creating a Search Index, you need to define its mapping. For a document type search index, the mapping is a schema that assigns a data type to each attribute of your dataset; for a suggestion type search index, it is the configuration used to generate suggestions.
Document type index mapping
Different data types serve different purposes, so make sure each type matches both the attribute in your dataset and the way you intend to use it. Attributes configured in the search index mapping can be used in the search query configuration. You do not need to provide types for all attributes of your dataset; define only the attributes that you want to use later in the search configuration.
Available types:
Available mapping types are listed below. A searchable-only attribute is used only for search and cannot be used for filtering, sorting or faceting. An aggregable-only attribute is used only for aggregation operations (filtering, sorting, faceting) and is not optimized for text search.
Text data types:
- text - for searchable-only attributes.
- keyword - for aggregable-only attributes.
- text_keyword - for searchable attributes that can also be used in aggregation operations.
- text_simplified - for searchable-only attributes that have simplified search matching rules (no partial matches or spellcheck).
- hierarchy - for attributes that have structural nesting, for example categories and subcategories.
- date - accepts either of the following:
  - strings containing formatted dates, e.g. "2021-01-01" or "2021/01/01 12:10:30".
  - a number representing milliseconds-since-the-epoch.
Numeric data types:
- long - a signed 64-bit integer with a minimum value of -2^63 and a maximum value of 2^63-1.
- integer - a signed 32-bit integer with a minimum value of -2^31 and a maximum value of 2^31-1.
- short - a signed 16-bit integer with a minimum value of -32,768 and a maximum value of 32,767.
- byte - a signed 8-bit integer with a minimum value of -128 and a maximum value of 127.
- double - a double-precision 64-bit IEEE 754 floating point number, restricted to finite values.
- float - a single-precision 32-bit IEEE 754 floating point number, restricted to finite values.
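As a rough illustration of how the ranges above can guide the choice of an integer type, the following Python sketch picks the narrowest type that covers a dataset's minimum and maximum values. The helper name smallest_integer_type is hypothetical and not part of LupaSearch; only the ranges are taken from the list above.

```python
# Inclusive ranges of the signed integer mapping types, narrowest first.
INTEGER_TYPES = [
    ("byte", -2**7, 2**7 - 1),
    ("short", -2**15, 2**15 - 1),
    ("integer", -2**31, 2**31 - 1),
    ("long", -2**63, 2**63 - 1),
]


def smallest_integer_type(min_value: int, max_value: int) -> str:
    """Return the narrowest type whose range covers [min_value, max_value]."""
    for name, low, high in INTEGER_TYPES:
        if low <= min_value and max_value <= high:
            return name
    raise ValueError("values do not fit into a signed 64-bit integer")


print(smallest_integer_type(1, 5))              # small values fit into byte
print(smallest_integer_type(0, 3_000_000_000))  # exceeds 32 bits, needs long
```

A wider type is always a safe choice as well; the sketch only shows where each boundary lies.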
Other data types:
- boolean - a boolean field accepts true or false values, but can also accept strings which are interpreted as either true or false. Boolean types support sorting, but not aggregations.
  - False values: false, "false", "" (empty string);
  - True values: true, "true".
- readonly - a readonly property can be of any type, but can only be used in query select fields. No search, aggregation or boosting actions can be performed on such a field. Use this type for large read-only product fields to speed up the indexing process.
- flattened - for objects that can be used for faceting or filtering but won't be used in search. Nested objects within this field can be accessed with object dot notation (e.g. variants.size.label). Subfields of this type are mapped as keyword, and only terms faceting and filtering are allowed. Example of a field that can have a flattened type: {"variants": [{"size": {"value": 1, "label": "One"}}, {"size": {"value": 2, "label": "two"}}]}.
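To make the dot notation for flattened fields more concrete, here is a minimal Python sketch that collects the values found at a path such as variants.size.label in the example object above. The function values_at_path is a hypothetical helper written for this illustration; it shows how the notation reads, not how LupaSearch resolves paths internally.

```python
from typing import Any, List


def values_at_path(document: Any, path: str) -> List[Any]:
    """Collect every value reachable at a dot-notation path."""
    current = [document]
    for key in path.split("."):
        next_level = []
        for node in current:
            # A list of objects is traversed element by element.
            items = node if isinstance(node, list) else [node]
            for item in items:
                if isinstance(item, dict) and key in item:
                    next_level.append(item[key])
        current = next_level
    # Report the elements of a final list value one by one.
    result = []
    for value in current:
        result.extend(value if isinstance(value, list) else [value])
    return result


document = {"variants": [{"size": {"value": 1, "label": "One"}},
                         {"size": {"value": 2, "label": "two"}}]}
print(values_at_path(document, "variants.size.label"))  # ['One', 'two']
```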
Mapping properties
Suggestable / Spellcheck
suggestable is a boolean mapping field which defines that the given property should be used to generate spellcheck suggestions if the user query yields no results or only a small number of results (configured as the Did You Mean functionality in Query Configuration):
{
"id": {
"type": "keyword"
},
"title": {
"type": "text",
"suggestable": true
},
"brand": {
"type": "text_keyword",
"suggestable": true
}
}
suggestable can only be set on text or text_keyword type fields.
Language
It is possible to override Search Index language for each individual field in the mapping:
{
"id": {
"type": "keyword"
},
"title_lt": {
"type": "text",
"language": "lt"
},
"title_en": {
"type": "text_keyword",
"language": "en-us"
}
}
Such fields, when used in a Search Query configuration, are matched using only the grammar and stemming rules of that particular language. This is useful if you want to support searches in multiple languages within a single Search Query.
A list of supported languages can be found here.
language can only be set on text or text_keyword type fields.
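The restriction that suggestable and language apply only to text and text_keyword fields can be checked before a mapping is submitted. The following is a small client-side sketch under that assumption; find_invalid_properties is a hypothetical helper, and the actual error reporting of the API is not described here.

```python
from typing import Dict, List

# Per the rules above, these properties are only valid on text-like types.
TEXT_ONLY_PROPERTIES = ("suggestable", "language")
ALLOWED_TYPES = ("text", "text_keyword")


def find_invalid_properties(mapping: Dict[str, dict]) -> List[str]:
    """Return a message for every text-only property set on another type."""
    problems = []
    for field, config in mapping.items():
        for prop in TEXT_ONLY_PROPERTIES:
            if prop in config and config.get("type") not in ALLOWED_TYPES:
                problems.append(
                    f"{field}: '{prop}' requires type text or text_keyword"
                )
    return problems


mapping = {
    "id": {"type": "keyword", "suggestable": True},   # not allowed on keyword
    "title_lt": {"type": "text", "language": "lt"},   # allowed
}
print(find_invalid_properties(mapping))
```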
Skip number char split
By default, when analyzing terms, LupaSearch splits segments that have conjoined numbers and letters. For example, the term abc123 would be split into abc and 123. This allows searches for 123, abc, 123 abc, abc_123 to successfully match the term.
However, in some cases, you may want to disable this behavior (for example, when searches return too many low-relevance results). To do this, set the skipNumberCharSplit property to true in the mapping:
{
"id": {
"type": "keyword"
},
"title": {
"type": "text",
"skipNumberCharSplit": true
}
}
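The effect of the split can be approximated with a short Python sketch. It is only a model of the behavior described above, not the LupaSearch analyzer itself: split_term and query_matches are hypothetical helpers, and the assumption that a term is kept as a single segment when the split is skipped is made for illustration.

```python
import re
from typing import List


def split_term(term: str, skip_number_char_split: bool = False) -> List[str]:
    """Approximate the described splitting of conjoined letters and digits."""
    if skip_number_char_split:
        # Assumption: the term is kept as one segment.
        return [term]
    # Emit runs of digits and runs of letters as separate segments.
    return re.findall(r"\d+|[^\W\d_]+", term)


def query_matches(query: str, term: str, skip: bool = False) -> bool:
    """True if every segment of the query is a segment of the indexed term."""
    return set(split_term(query, skip)) <= set(split_term(term, skip))


print(split_term("abc123"))                      # ['abc', '123']
print(query_matches("123", "abc123"))            # True
print(query_matches("123", "abc123", skip=True)) # False
```

With the split disabled, a query for 123 alone no longer matches abc123 in this model, which is the kind of low-relevance match the option is meant to remove.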
Ignore Whitespace and Symbols (for Product Codes)
For fields like code or other technical identifiers, you may want to ignore whitespace and symbols during searches. For example, if you have a field with a value of 123-456, you may want to match it with a search for 123456; or if the value is +3706666666, you may want to match it with a search for +(370) 6666-666.
To do this, you can set the ignoreWhitespaceAndSymbols property to true in the mapping of a chosen field:
{
"id": {
"type": "keyword"
},
"code": {
"type": "text",
"ignoreWhitespaceAndSymbols": true
}
}
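One way to picture this option is as a normalization step applied to both the stored value and the search query. The sketch below assumes that every character other than a letter or a digit is dropped; normalize_code is a hypothetical helper, and the exact character rules used by LupaSearch may differ.

```python
def normalize_code(value: str) -> str:
    """Drop every character that is not a letter or a digit."""
    return "".join(ch for ch in value if ch.isalnum())


# Both examples from the text compare equal after normalization.
print(normalize_code("123-456") == normalize_code("123456"))
print(normalize_code("+3706666666") == normalize_code("+(370) 6666-666"))
```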
When to use 'text', 'keyword' and 'text_keyword' properties
All of these properties are used for simple and complex text (string) fields, but they differ slightly in functionality.
- Use the text property for fields that you want to include in query fields for full text search (where all selected language grammar rules are applied), but won't use for sorting and aggregation (facets). Examples could include product title (if there is no requirement for products to be sorted by name) and description;
- Use the keyword field for properties that you want to include in sorting options and facets, but are not required for full text search. Examples include product id, sku and tags. It is important to note that such fields can still be used in search, but since grammar rules are not applied, only exact matches will be returned;
- Use the text_keyword field to combine text and keyword functionality for properties that require both full text search and aggregation with sorting. Examples include categories, brands and other fields;
- Use the text_simplified field for long searchable text strings, where you don't need spellcheck or partial matches. Useful to improve searching and indexing performance.
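The guidance above can be condensed into a small decision helper. This is a sketch for illustration only: choose_text_type and its parameter names are hypothetical, and it encodes just the rules stated in this section.

```python
def choose_text_type(full_text_search: bool,
                     aggregation_or_sorting: bool,
                     simplified: bool = False) -> str:
    """Pick a string mapping type from the intended use of the field."""
    if full_text_search and aggregation_or_sorting:
        return "text_keyword"
    if full_text_search:
        # text_simplified trades spellcheck and partial matches for speed.
        return "text_simplified" if simplified else "text"
    if aggregation_or_sorting:
        return "keyword"
    raise ValueError("field is neither searched nor aggregated")


print(choose_text_type(True, False))                   # e.g. description
print(choose_text_type(False, True))                   # e.g. sku
print(choose_text_type(True, True))                    # e.g. brand
print(choose_text_type(True, False, simplified=True))  # e.g. long text
```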
How to set the mapping for a new document type search index
POST /v1/indices/{indexId}/mapping
{
"id": {
"type": "keyword"
},
"title": {
"type": "text"
},
"brand": {
"type": "text_keyword"
},
"category": {
"type": "hierarchy"
},
"rating": {
"type": "integer"
}
}
How to update the mapping for an existing document type search index
PUT /v1/indices/{indexId}/mapping
{
"id": {
"type": "keyword"
},
"title": {
"type": "text"
},
"brand": {
"type": "text_keyword"
},
"category": {
"type": "hierarchy"
},
"rating": {
"type": "integer"
}
}
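For readers calling these endpoints from code, the following Python sketch builds the POST and PUT requests shown above using only the standard library. The base URL https://api.example.com and the function build_mapping_request are placeholders, not documented values. Authentication headers are left to the caller because the scheme is not described in this section.

```python
import json
import urllib.request
from typing import Dict, Optional


def build_mapping_request(base_url: str, index_id: str, mapping: Dict[str, dict],
                          update: bool = False,
                          extra_headers: Optional[Dict[str, str]] = None
                          ) -> urllib.request.Request:
    """Build (but do not send) a request that sets or updates a mapping."""
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})  # e.g. authentication, caller-supplied
    return urllib.request.Request(
        f"{base_url}/v1/indices/{index_id}/mapping",
        data=json.dumps(mapping).encode("utf-8"),
        headers=headers,
        method="PUT" if update else "POST",  # PUT updates an existing mapping
    )


mapping = {"id": {"type": "keyword"}, "title": {"type": "text"}}
request = build_mapping_request("https://api.example.com", "my-index", mapping)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```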
Suggestion type index mapping
See Suggestion Mapping.
Dynamic mapping
Suppose you have the following fields in your product feed:
{
"id": 15,
"attr_color": ["Red", "Blue"],
"attr_tag": "Shirts",
"attr_weight": "100",
"attr_barcode": "1234567890"
}
Dynamic mapping allows fields with a common prefix (those that start with the same characters) to be defined with a single mapping type.
To use a dynamic mapping property, append an asterisk (*) to the end of the common part of the variable property names:
{
"id": {
"type": "keyword"
},
"attr_*": {
"type": "keyword"
}
}
The dynamic property asterisk * can only be used at the end of the property name. Therefore, dynamic mapping fields like *_attr or attr_*_field1 are invalid.
Dynamic mapping types can be used in query configurations in the same manner as static mapping properties. However, there are some caveats, which are described in the Dynamic Properties section.
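As an illustration of the prefix rule, the sketch below checks whether a dynamic property name is valid and looks up the mapping type that would apply to a document field. Both functions are hypothetical helpers. Preferring an exact name over a dynamic one, and the longest prefix when several match, is an assumption of this sketch rather than documented behavior.

```python
from typing import Dict, Optional


def is_valid_dynamic_name(name: str) -> bool:
    """An asterisk, if present, may only be the last character of the name."""
    return "*" not in name[:-1]


def resolve_field_type(field: str, mapping: Dict[str, dict]) -> Optional[str]:
    """Find the mapping type for a field, falling back to dynamic prefixes."""
    if field in mapping:
        return mapping[field]["type"]
    prefixes = [name[:-1] for name in mapping
                if name.endswith("*") and field.startswith(name[:-1])]
    if not prefixes:
        return None
    # Assumption: when several prefixes match, the longest one wins.
    return mapping[max(prefixes, key=len) + "*"]["type"]


mapping = {"id": {"type": "keyword"}, "attr_*": {"type": "keyword"}}
print(resolve_field_type("attr_color", mapping))  # keyword
print(is_valid_dynamic_name("attr_*_field1"))     # False
```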