Mapping

Overview

After creating a Search Index, you need to define its mapping. For a document type search index, the mapping is a schema that assigns a data type to each attribute of your dataset. For a suggestion type search index, it is the configuration used to generate suggestions.

Document type index mapping

Different data types serve different purposes, so make sure each one matches both the type of your dataset attribute and the way you intend to use it. Attributes configured in the search index mapping can be used in search query configuration. You do not need to provide types for all attributes of your dataset; define only the attributes that you want to use later in search configuration.

Available types:

A searchable-only attribute is used only for search and cannot be used for filtering, sorting or faceting. An aggregable-only attribute is used only for aggregation operations (filtering, sorting, faceting) and is not optimized for text search.

Text data types:

text - for searchable-only attributes.
keyword - for aggregable-only attributes.
text_keyword - for searchable attributes that can also be used in aggregation operations.
text_simplified - for searchable-only attributes that have simplified search matching rules (no partial matches or spellcheck).
hierarchy - for attributes that have structural nesting, for example categories and subcategories.
date - for date values, provided as either:
- a string containing a formatted date, e.g. "2021-01-01" or "2021/01/01 12:10:30";
- a number representing milliseconds since the epoch.
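
To illustrate how the two accepted date forms relate, here is a minimal Python sketch that converts a formatted date string into milliseconds since the epoch. The helper name to_epoch_millis is hypothetical, and interpreting the strings as UTC is an assumption; this page does not state how time zones are handled.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(value: str, fmt: str) -> int:
    """Parse a formatted date string and return milliseconds since the epoch.

    Assumption: the string is interpreted as UTC.
    """
    parsed = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
    # Integer division by a one-millisecond timedelta avoids float rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


print(to_epoch_millis("2021-01-01", "%Y-%m-%d"))
print(to_epoch_millis("2021/01/01 12:10:30", "%Y/%m/%d %H:%M:%S"))
```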

Numeric data types:

long - a signed 64-bit integer with a minimum value of -2⁶³ and a maximum value of 2⁶³-1.
integer - a signed 32-bit integer with a minimum value of -2³¹ and a maximum value of 2³¹-1.
short - a signed 16-bit integer with a minimum value of -32,768 and a maximum value of 32,767.
byte - a signed 8-bit integer with a minimum value of -128 and a maximum value of 127.
double - a double-precision 64-bit IEEE 754 floating point number, restricted to finite values.
float - a single-precision 32-bit IEEE 754 floating point number, restricted to finite values.
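
When choosing between the integer types, it can help to check the range of values in your dataset against the limits listed above. The following Python sketch picks the narrowest type that fits; the function name smallest_integer_type is hypothetical and the helper is not part of any LupaSearch API.

```python
# Bit widths of the signed integer types listed above.
INTEGER_TYPES = [("byte", 8), ("short", 16), ("integer", 32), ("long", 64)]


def smallest_integer_type(values):
    """Return the narrowest integer mapping type that can hold every value."""
    lowest, highest = min(values), max(values)
    for name, bits in INTEGER_TYPES:
        minimum = -(2 ** (bits - 1))
        maximum = 2 ** (bits - 1) - 1
        if minimum <= lowest and highest <= maximum:
            return name
    raise ValueError("values do not fit into a signed 64-bit integer")


print(smallest_integer_type([1, 2, 3, 4, 5]))      # ratings fit into byte
print(smallest_integer_type([15, 40_000]))         # exceeds short, fits integer
```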

Other data types:

boolean - accepts true or false values, as well as strings that are interpreted as either true or false. Boolean types support sorting, but not aggregations.
- False values: false, "false", "" (empty string);
- True values: true, "true".
readonly - a readonly property can be of any type, but can only be used in query select fields. No search, aggregation or boosting actions can be performed on such a field. Use this type for large readonly product fields to speed up the indexing process.
flattened - for objects that can be used for faceting or filtering but won't be used in search. Nested objects within this field can be accessed with object dot notation (e.g. variants.size.label). Subfields of this type are mapped as keyword, and only terms faceting and filtering are allowed on them. Example of a field that can have a flattened type: {"variants": [{"size": {"value": 1, "label": "One"}}, {"size": {"value": 2, "label": "two"}}]}.
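
The dot notation used for flattened fields can be illustrated with a small Python sketch that collects every value found at a given path, stepping through arrays along the way. It only demonstrates how a path such as variants.size.label addresses nested values; the function name values_at_path is hypothetical and this is not how LupaSearch implements the type internally.

```python
def values_at_path(document, path):
    """Collect all values reachable via a dot-notation path, descending into lists."""
    current = [document]
    for key in path.split("."):
        next_nodes = []
        for node in current:
            # A list contributes each of its elements as a candidate object.
            candidates = node if isinstance(node, list) else [node]
            for candidate in candidates:
                if isinstance(candidate, dict) and key in candidate:
                    next_nodes.append(candidate[key])
        current = next_nodes
    # Unpack any lists left at the final level.
    result = []
    for node in current:
        result.extend(node if isinstance(node, list) else [node])
    return result


document = {
    "variants": [
        {"size": {"value": 1, "label": "One"}},
        {"size": {"value": 2, "label": "two"}},
    ]
}
print(values_at_path(document, "variants.size.label"))
```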

Mapping properties

Suggestable / Spellcheck

suggestable is a boolean mapping property. It marks a field as a source for spellcheck suggestions, which are generated when a user query yields no results or only a small number of results (configured as the Did You Mean functionality in Query Configuration).

{
  "id": {
    "type": "keyword"
  },
  "title": {
    "type": "text",
    "suggestable": true
  },
  "brand": {
    "type": "text_keyword",
    "suggestable": true
  }
}

suggestable can only be set on text or text_keyword type fields.

Language

It is possible to override Search Index language for each individual field in the mapping:

{
  "id": {
    "type": "keyword"
  },
  "title_lt": {
    "type": "text",
    "language": "lt"
  },
  "title_en": {
    "type": "text_keyword",
    "language": "en-us"
  }
}

When used in Search Query configuration, such fields are matched using only the grammar and stemming rules of their own language. This is useful if you want to support searches in multiple languages within a single Search Query.

A list of supported languages can be found here.

Language can only be set on text or text_keyword type fields.
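
Both suggestable and language are restricted to text and text_keyword fields, so a mapping can be checked locally before it is submitted. The Python sketch below is one way to do that; find_invalid_properties is a hypothetical helper, not part of the LupaSearch API, and the actual error reporting of the service may differ.

```python
TEXT_TYPES = {"text", "text_keyword"}
RESTRICTED_PROPERTIES = ("suggestable", "language")


def find_invalid_properties(mapping):
    """List fields that set suggestable or language on an unsupported type."""
    problems = []
    for field, config in mapping.items():
        for prop in RESTRICTED_PROPERTIES:
            if prop in config and config.get("type") not in TEXT_TYPES:
                problems.append(
                    f"{field}: '{prop}' requires type text or text_keyword"
                )
    return problems


mapping = {
    "id": {"type": "keyword", "language": "lt"},      # invalid: keyword field
    "title_lt": {"type": "text", "language": "lt"},   # valid
    "brand": {"type": "text_keyword", "suggestable": True},  # valid
}
print(find_invalid_properties(mapping))
```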

Skip number char split

By default, when analyzing terms, LupaSearch splits segments that have conjoined numbers and letters. For example, the term abc123 would be split into abc and 123. This allows searches for 123, abc, 123 abc, abc_123 to successfully match the term.

However, in some cases, you may want to disable this behavior (for example, when searches return too many low-relevance results). To do this, set the skipNumberCharSplit property to true in the mapping:

{
  "id": {
    "type": "keyword"
  },
  "title": {
    "type": "text",
    "skipNumberCharSplit": true
  }
}
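
The effect of this setting can be approximated with a short Python sketch. It is a simplified model, not the actual LupaSearch analyzer: the regular expressions, the lowercasing, and the treatment of underscores and spaces as separators are all assumptions made for illustration.

```python
import re


def tokenize(term, skip_number_char_split=False):
    """Split a term into tokens, optionally keeping letter-digit runs together."""
    term = term.lower()
    if skip_number_char_split:
        # Keep conjoined letters and digits as one token, e.g. "abc123".
        return re.findall(r"[^\W_]+", term)
    # Default: separate runs of digits from runs of letters.
    return re.findall(r"\d+|[^\W\d_]+", term)


def matches(query, term, skip_number_char_split=False):
    """Return True if every query token is among the tokens of the term."""
    term_tokens = set(tokenize(term, skip_number_char_split))
    query_tokens = tokenize(query, skip_number_char_split)
    return bool(query_tokens) and all(t in term_tokens for t in query_tokens)


print(tokenize("abc123"))                                     # default split
print(tokenize("abc123", skip_number_char_split=True))        # kept whole
print(matches("123", "abc123"))                               # matches by default
print(matches("123", "abc123", skip_number_char_split=True))  # no longer matches
```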

Ignore Whitespace and Symbols (for Product Codes)

For fields like code or other technical identifiers, you may want to ignore whitespace and symbols during searches. For example, if you have a field with a value of 123-456, you may want to match it with a search for 123456; or if the value is +3706666666, you may want to match it with a search for +(370) 6666-666.

To do this, you can set the ignoreWhitespaceAndSymbols property to true in the mapping of a chosen field:

{
  "id": {
    "type": "keyword"
  },
  "code": {
    "type": "text",
    "ignoreWhitespaceAndSymbols": true
  }
}
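
Conceptually, this behaves as if both the stored value and the search query were reduced to their letters and digits before comparison. The Python sketch below shows that idea using the examples above; normalize_code is a hypothetical helper and the real analysis performed by LupaSearch may differ in detail.

```python
import re


def normalize_code(value):
    """Remove whitespace and symbols, keeping only letters and digits."""
    return re.sub(r"[\W_]+", "", value)


print(normalize_code("123-456"))
print(normalize_code("+3706666666") == normalize_code("+(370) 6666-666"))
```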

When to use 'text', 'keyword' and 'text_keyword' properties

All of these properties are used for simple and complex text (string) fields, but they slightly differ in functionality.

Use text for fields that you want to include in query fields for full text search (where all grammar rules of the selected language are applied), but won't use for sorting or aggregation (facets). Examples include product title (if there is no requirement for products to be sorted by name) and description.
Use keyword for fields that you want to include in sorting options and facets, but that are not required for full text search. Examples include product id, sku and tags. Note that such fields can still be used in search, but since grammar rules are not applied, only exact matches will be returned.
Use text_keyword to combine text and keyword functionality for fields that require both full text search and aggregation with sorting. Examples include categories and brands.
Use text_simplified for long searchable text strings where you don't need spellcheck or partial matches. This can improve searching and indexing performance.
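
The guidance above can be summarized as a small decision helper. This Python sketch is only a reading aid: the function name and parameters are hypothetical, and returning readonly for fields that need neither search nor aggregation is an interpretation of the readonly type described earlier on this page.

```python
def choose_text_type(full_text_search, aggregation_or_sorting,
                     needs_spellcheck_or_partial=True):
    """Suggest a mapping type for a string field based on how it will be used."""
    if full_text_search and aggregation_or_sorting:
        return "text_keyword"
    if full_text_search:
        # text_simplified trades spellcheck and partial matches for performance.
        return "text" if needs_spellcheck_or_partial else "text_simplified"
    if aggregation_or_sorting:
        return "keyword"
    # Assumption: a field that is only returned in results can be readonly.
    return "readonly"


print(choose_text_type(True, False))          # e.g. description
print(choose_text_type(False, True))          # e.g. sku
print(choose_text_type(True, True))           # e.g. brand
print(choose_text_type(True, False, False))   # e.g. long specification text
```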

How to set mapping for new document type search index Docs

POST /v1/indices/{indexId}/mapping
{
  "id": {
    "type": "keyword"
  },
  "title": {
    "type": "text"
  },
  "brand": {
    "type": "text_keyword"
  },
  "category": {
    "type": "hierarchy"
  },
  "rating": {
    "type": "integer"
  }
}

How to update mapping for existing document type search index Docs

PUT /v1/indices/{indexId}/mapping
{
  "id": {
    "type": "keyword"
  },
  "title": {
    "type": "text"
  },
  "brand": {
    "type": "text_keyword"
  },
  "category": {
    "type": "hierarchy"
  },
  "rating": {
    "type": "integer"
  }
}
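
The two requests above differ only in the HTTP method, so they can be built by one helper. The following Python sketch uses only the standard library and constructs the request without sending it. The base URL, the Bearer token header and the function name build_mapping_request are assumptions made for illustration; check the API documentation for the real host and authentication scheme.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the real API host.
BASE_URL = "https://api.example.com"


def build_mapping_request(index_id, mapping, token, update=False):
    """Build a POST (new mapping) or PUT (update mapping) request."""
    url = f"{BASE_URL}/v1/indices/{index_id}/mapping"
    return urllib.request.Request(
        url,
        data=json.dumps(mapping).encode("utf-8"),
        method="PUT" if update else "POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API accepts a Bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


mapping = {
    "id": {"type": "keyword"},
    "title": {"type": "text"},
    "rating": {"type": "integer"},
}
request = build_mapping_request("my-index-id", mapping, "my-token")
print(request.get_method(), request.full_url)
```

To actually send the request, pass it to urllib.request.urlopen(request) once the host and credentials are real.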

Suggestion type index mapping

See Suggestion Mapping.

Dynamic mapping

Suppose you have the following fields in your product feed:

{
  "id": 15,
  "attr_color": ["Red", "Blue"],
  "attr_tag": "Shirts",
  "attr_weight": "100",
  "attr_barcode": "1234567890"
}

Dynamic mapping lets you assign a single mapping type to a group of fields whose names start with the same characters.

To define a dynamic mapping property, append an asterisk (*) to the common part of the property names:

{
  "id": {
    "type": "keyword"
  },
  "attr_*": {
    "type": "keyword"
  }
}

The dynamic property asterisk * can only be used at the end of the property name. Therefore, dynamic mapping fields like *_attr or attr_*_field1 are invalid.
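
The naming rule and the prefix matching can be sketched in Python as follows. Both function names are hypothetical, and the precedence used here (an exact property name first, then the longest matching prefix) is an assumption rather than documented LupaSearch behavior.

```python
def is_valid_property_name(name):
    """An asterisk is allowed only as the last character of the name."""
    return "*" not in name[:-1]


def resolve_type(field, mapping):
    """Find the mapping type for a field using static, then dynamic properties."""
    if field in mapping:
        return mapping[field]["type"]
    best = None
    for name in mapping:
        if name.endswith("*") and field.startswith(name[:-1]):
            # Assumption: prefer the most specific (longest) dynamic prefix.
            if best is None or len(name) > len(best):
                best = name
    return mapping[best]["type"] if best else None


mapping = {"id": {"type": "keyword"}, "attr_*": {"type": "keyword"}}

print(is_valid_property_name("attr_*"))         # asterisk at the end
print(is_valid_property_name("*_attr"))         # asterisk at the start
print(is_valid_property_name("attr_*_field1"))  # asterisk in the middle
print(resolve_type("attr_color", mapping))
print(resolve_type("title", mapping))           # not covered by the mapping
```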

Dynamic mapping types can be used in query configurations in the same manner as static mapping properties. However, there are some caveats, which are described in the Dynamic Properties section.