Documents

Overview

LupaSearch is a cloud-hosted software-as-a-service product that requires your product feed to be stored on our servers. This page describes the ways you can use the Documents API to keep your products in sync with the LupaSearch database.

A document is any searchable unit that can be queried. Usually, documents are your products, but you can also use LupaSearch to store and query product categories, tags, blog posts, documentation, and any other object your website needs.

Important: before importing documents into the search index, make sure that the index has a defined mapping: a contract that describes and configures the types of your document fields.

Important: each imported document must contain an id field.

Adding / updating documents

API Reference: Index documents.

To import new documents to the index, issue an HTTP POST request with an array of your documents:

POST /v1/indices/{indexId}/documents
{
  "documents": [
    {
      "id": 1,
      "name": "TV Samsung",
      "category": ["Electronics"],
      "rating": 5
    },
    {
      "id": 2,
      "name": "Refrigerator Beko",
      "category": ["Appliances", "Electronics"],
      "rating": 4
    }
  ]
}

If a document with the given id already exists in the index, it is replaced with the new entry: with this endpoint, any existing properties that are not present in the new document are removed. Use a PATCH request if you need a partial update.

If you have a large number of documents, you can split the import into multiple requests. The body of a single request should not exceed 10 MB.
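The splitting described above can be sketched as follows. This is a minimal illustration, not an official client: the helper name split_into_batches is hypothetical, "10 MB" is read as 10,000,000 bytes, and the size estimate only holds if the request body is later serialized with json.dumps using its default settings.

```python
import json
from typing import Any, Iterator

# Assumption: the "10 MB" limit is interpreted as 10,000,000 bytes.
MAX_BODY_BYTES = 10_000_000


def split_into_batches(
    documents: list[dict[str, Any]], max_bytes: int = MAX_BODY_BYTES
) -> Iterator[list[dict[str, Any]]]:
    """Yield document lists whose {"documents": [...]} body fits in max_bytes."""
    # Size of the empty wrapper: {"documents": []}
    overhead = len(json.dumps({"documents": []}).encode("utf-8"))
    batch: list[dict[str, Any]] = []
    size = overhead
    for doc in documents:
        # +2 accounts for the ", " separator, so the estimate never undercounts.
        doc_size = len(json.dumps(doc).encode("utf-8")) + 2
        if overhead + doc_size > max_bytes:
            raise ValueError(f"document {doc.get('id')!r} is too large for one request")
        if batch and size + doc_size > max_bytes:
            yield batch
            batch, size = [], overhead
        batch.append(doc)
        size += doc_size
    if batch:
        yield batch
```

Each yielded list can then be sent as the "documents" array of its own POST request.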

The document import response includes a success flag and a task batch key:

{
  "success": true,
  "batchKey": "reindex-1627373737-abc"
}

Once the document request has finished, the entries are added to an internal document import queue. To track import progress, use the Task API with the batchKey filter set to the value from the documents response.
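Progress tracking of this kind usually comes down to a polling loop. The sketch below is deliberately independent of the Task API's actual shape, which this page does not describe: fetch_tasks is a caller-supplied function (hypothetical) that returns the tasks for a batch key, and both the "status" field name and the set of terminal status values are assumptions that the caller must match to the real API.

```python
import time
from typing import Any, Callable


def wait_for_batch(
    fetch_tasks: Callable[[str], list[dict[str, Any]]],
    batch_key: str,
    done_statuses: frozenset[str],
    poll_seconds: float = 2.0,
    max_polls: int = 30,
) -> list[dict[str, Any]]:
    """Poll until every task for batch_key has a status in done_statuses."""
    for _ in range(max_polls):
        tasks = fetch_tasks(batch_key)
        # Assumption: each task is a dict carrying a "status" field.
        if tasks and all(t.get("status") in done_statuses for t in tasks):
            return tasks
        time.sleep(poll_seconds)
    raise TimeoutError(f"batch {batch_key!r} did not finish after {max_polls} polls")
```

Passing the fetch function in keeps the loop testable without network access and avoids hard-coding an endpoint that is not documented here.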

Partial updates

API Reference: Patch documents.

If you need to update only part of each document, you can use an HTTP PATCH request with the same schema.

A partial update is useful if you only need to update a small subset of document properties, such as price, discounts, or stock.

PATCH /v1/indices/{indexId}/documents
{
  "documents": [
    {
      "id": 1,
      "price": 2.99
    },
    {
      "id": 2,
      "in_stock": 8
    }
  ]
}
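The difference between a full import (POST) and a partial update (PATCH) can be modelled locally with plain dictionaries. This is only an illustration of the semantics described on this page, not how the service is implemented: the function names are hypothetical, the merge is assumed to be shallow (top-level fields only), and the model only covers ids that already exist in the index.

```python
from typing import Any

Document = dict[str, Any]


def apply_post(index: dict[Any, Document], doc: Document) -> None:
    """Full replacement: fields missing from the new document are dropped."""
    index[doc["id"]] = dict(doc)


def apply_patch(index: dict[Any, Document], doc: Document) -> None:
    """Partial update: only the supplied fields change (shallow merge assumed)."""
    index[doc["id"]] = {**index[doc["id"]], **doc}


index = {1: {"id": 1, "name": "TV Samsung", "rating": 5}}

apply_patch(index, {"id": 1, "price": 2.99})
print(index[1])  # name and rating are kept, price is added

apply_post(index, {"id": 1, "price": 2.99})
print(index[1])  # only id and price remain
```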

Deleting documents

API Reference: Delete documents.

To delete documents, issue an HTTP POST request to the document deletion endpoint with a list of the document ids to delete:

POST /v1/indices/{indexId}/documents/batchDelete
{
  "ids": ["1", "2"]
}
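As a rough sketch, the deletion request above could be built with Python's standard library as shown below. BASE_URL is a placeholder rather than the real API host, the function name is hypothetical, and any authentication headers are left to the caller because this page does not describe them. Converting ids to strings mirrors the example body above and is an assumption about what the endpoint expects.

```python
import json
import urllib.request
from typing import Any, Iterable, Optional

# Placeholder host (hypothetical); substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_batch_delete_request(
    index_id: str,
    ids: Iterable[Any],
    extra_headers: Optional[dict[str, str]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a batchDelete request for the given ids."""
    # Assumption: ids are sent as strings, as in the example body.
    body = json.dumps({"ids": [str(i) for i in ids]}).encode("utf-8")
    headers = {"Content-Type": "application/json", **(extra_headers or {})}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/indices/{index_id}/documents/batchDelete",
        data=body,
        headers=headers,
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen once the host and authentication headers are filled in.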

Retrieving document count

API Reference: Get document count.

You can retrieve the current document count of an index with an HTTP GET request:

GET /v1/indices/{indexId}/documents/count

The response includes the actual current document count for the given index. This count does not include documents that are being reindexed.

{
  "count": 1000
}

Full document replacement

API Reference: Replace all documents.

If you need to update all of the documents at once, you can use the "Replace all documents" endpoint.

This performs a full document replacement in your index. The new documents become available to search only when all of the document import tasks have finished.

If your document count is large, it is recommended to split the request into multiple batches and use the finished flag to indicate that all of the import batches have been sent.

If the finished flag is not sent, or is set to false, LupaSearch will accept the documents but will not make the new items available to search until a {finished: true} request is sent.

POST /v1/indices/{indexId}/documents/replaceAll

First and subsequent requests:

{
  "documents": [
    {
      "id": 1,
      "name": "TV Samsung",
      "category": ["Electronics"],
      "rating": 5
    },
    {
      "id": 2,
      "name": "Refrigerator Beko",
      "category": ["Appliances", "Electronics"],
      "rating": 4
    }
  ],
  "finished": false
}

Last request:

{
  "documents": [
    {
      "id": 9999,
      "name": "TV Lg",
      "category": ["Electronics"],
      "rating": 3
    },
    {
      "id": 10000,
      "name": "Refrigerator",
      "category": ["Appliances"],
      "rating": 2
    }
  ],
  "finished": true
}
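The batching pattern shown in the two request bodies above can be generated programmatically. The following is a minimal sketch: the function name replace_all_payloads and the fixed batch_size are illustrative choices, and an empty document list yields no payloads at all, since this page does not say how an empty replacement should be sent.

```python
from typing import Any, Iterator


def replace_all_payloads(
    documents: list[dict[str, Any]], batch_size: int
) -> Iterator[dict[str, Any]]:
    """Yield replaceAll request bodies; only the last one has finished=True."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    total = len(documents)
    for start in range(0, total, batch_size):
        end = start + batch_size
        yield {
            "documents": documents[start:end],
            # True only for the batch that reaches the end of the list.
            "finished": end >= total,
        }
```

Each yielded dictionary is the JSON body of one POST to the replaceAll endpoint, sent in order.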

The full document replacement endpoint response uses the same batch key schema as an ordinary document import. For a full document replacement, the tasks include one special task of type document-replacement-end, whose status indicates whether the full document reindex has completed.

Cancel full document replacement

API Reference: Delete Temporary Index.

If for any reason you need to cancel a document replacement while it is in progress (that is, while the document-replacement-end task has not finished, or while the search index field reindexInProgress is true) and start over, you can issue an HTTP DELETE request. This cancels the reindex (document replacement) and deletes the hidden temporary index:

DELETE /v1/indices/{indexId}/temporary