Azure Cognitive Search Index

From GM-RKB

An Azure Cognitive Search Index is a search index for an Azure Cognitive Search service.



References

2023

  • https://learn.microsoft.com/en-us/azure/search/search-what-is-an-index
    • QUOTE: ... Field attributes determine how a field is used, such as whether it's used in full text search, faceted navigation, sort operations, and so forth.

      String fields are often marked as "searchable" and "retrievable". Fields used to narrow search results include "sortable", "filterable", and "facetable".

      • Attribute: Description
      • "searchable": Full-text searchable, subject to lexical analysis such as word-breaking during indexing. If you set a searchable field to a value like "sunny day", internally it's split into the individual tokens "sunny" and "day". For details, see How full text search works.
      • "filterable": Referenced in $filter queries. Filterable fields of type Edm.String or Collection(Edm.String) don't undergo word-breaking, so comparisons are for exact matches only. For example, if you set such a field f to "sunny day", $filter=f eq 'sunny' finds no matches, but $filter=f eq 'sunny day' will.
      • "sortable": By default the system sorts results by score, but you can configure sort based on fields in the documents. Fields of type Collection(Edm.String) can't be "sortable".
      • "facetable": Typically used in a presentation of search results that includes a hit count by category (for example, hotels in a specific city). This option can't be used with fields of type Edm.GeographyPoint. Fields of type Edm.String that are filterable, "sortable", or "facetable" can be at most 32 kilobytes in length. For details, see Create Index (REST API).
      • "key": Unique identifier for documents within the index. Exactly one field must be chosen as the key field and it must be of type Edm.String.
      • "retrievable": Determines whether the field can be returned in a search result. This is useful when you want to use a field (such as profit margin) as a filter, sorting, or scoring mechanism, but don't want the field to be visible to the end user. This attribute must be true for key fields.
    • Although you can add new fields at any time, existing field definitions are locked in for the lifetime of the index. For this reason, developers typically use the portal for creating simple indexes, testing ideas, or using the portal pages to look up a setting. Frequent iteration over an index design is more efficient if you follow a code-based approach so that you can rebuild the index easily.
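
The attribute rules quoted above can be checked locally before an index definition is submitted, which suits the code-based approach the quote recommends. The following is a minimal Python sketch, not the Azure SDK: the function names, the hotel-style field names, and the choice to treat an omitted "retrievable" as true are all assumptions made for illustration. The two match helpers are toy stand-ins (a lowercase whitespace split instead of real lexical analysis) that only mirror the "sunny day" example.

```python
def validate_fields(fields):
    """Return a list of violations of the attribute rules quoted above.

    `fields` is a list of dicts shaped like index field definitions.
    Hypothetical helper; only the rules stated in the quote are checked.
    """
    problems = []

    # Exactly one key field, of type Edm.String, and retrievable.
    keys = [f for f in fields if f.get("key")]
    if len(keys) != 1:
        problems.append(f"expected exactly one key field, found {len(keys)}")
    for f in keys:
        if f["type"] != "Edm.String":
            problems.append(f"key field {f['name']!r} must be of type Edm.String")
        # Assumption: an omitted "retrievable" is treated as true in this sketch.
        if not f.get("retrievable", True):
            problems.append(f"key field {f['name']!r} must be retrievable")

    for f in fields:
        # Collection(Edm.String) fields can't be sortable.
        if f["type"] == "Collection(Edm.String)" and f.get("sortable"):
            problems.append(f"field {f['name']!r}: Collection(Edm.String) can't be sortable")
        # Edm.GeographyPoint fields can't be facetable.
        if f["type"] == "Edm.GeographyPoint" and f.get("facetable"):
            problems.append(f"field {f['name']!r}: Edm.GeographyPoint can't be facetable")
    return problems


def searchable_match(value, term):
    """Toy model of a searchable field: the value is split into tokens."""
    return term.lower() in value.lower().split()


def filterable_match(value, term):
    """Toy model of a filterable string field: exact comparison only."""
    return value == term


# Illustrative field definitions (names are made up for this example).
draft_fields = [
    {"name": "hotelId", "type": "Edm.String", "key": True, "retrievable": True},
    {"name": "description", "type": "Edm.String", "searchable": True},
    {"name": "tags", "type": "Collection(Edm.String)", "filterable": True, "sortable": True},
    {"name": "location", "type": "Edm.GeographyPoint", "facetable": True},
]

for problem in validate_fields(draft_fields):
    print(problem)

print(searchable_match("sunny day", "sunny"))      # True: token match
print(filterable_match("sunny day", "sunny"))      # False: not an exact match
print(filterable_match("sunny day", "sunny day"))  # True: exact match
```

Because existing field definitions are locked in once the index exists, running a check like this against a draft definition is cheaper than discovering a rejected or unwanted attribute after creation.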

2023

  • https://learn.microsoft.com/en-us/training/modules/intro-to-azure-search/2c-understand-index
    • QUOTE: An Azure Cognitive Search index can be thought of as a container of searchable documents. Conceptually you can think of an index as a table and each row in the table represents a document. Tables have columns, and the columns can be thought of as equivalent to the fields in a document. Columns have data types, just as the fields do on the documents.
    • Index schema

      In Azure Cognitive Search, an index is a persistent collection of JSON documents and other content used to enable search functionality. The documents within an index can be thought of as rows in a table, each document is a single unit of searchable data in the index.

      The index includes a definition of the structure of the data in these documents, called its schema. An example of an index schema with AI-extracted fields keyphrases and imageTags is below:

{
 "name": "index",
 "fields": [
   {
     "name": "content", "type": "Edm.String", "analyzer": "standard.lucene", "fields": []
   },
   {
     "name": "keyphrases", "type": "Collection(Edm.String)", "analyzer": "standard.lucene", "fields": []
   },
   {
     "name": "imageTags", "type": "Collection(Edm.String)", "analyzer": "standard.lucene", "fields": []
   }
 ]
}
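
Since an index schema is plain JSON, standard tooling can read it. The sketch below assumes the schema text is held in a Python string and lists each field with its type, echoing the table analogy from the quote (fields as columns, each with a data type). The names SCHEMA_TEXT and field_summary are illustrative, and the embedded JSON is a syntactically valid rendering of the three-field example.

```python
import json

# Three-field example schema, written as valid JSON.
SCHEMA_TEXT = """
{
  "name": "index",
  "fields": [
    {"name": "content", "type": "Edm.String", "analyzer": "standard.lucene", "fields": []},
    {"name": "keyphrases", "type": "Collection(Edm.String)", "analyzer": "standard.lucene", "fields": []},
    {"name": "imageTags", "type": "Collection(Edm.String)", "analyzer": "standard.lucene", "fields": []}
  ]
}
"""


def field_summary(schema_text):
    """Parse a schema string and return (field name, field type) pairs."""
    schema = json.loads(schema_text)  # raises json.JSONDecodeError on malformed input
    return [(field["name"], field["type"]) for field in schema["fields"]]


# Print the fields as "columns" with their data types.
for name, field_type in field_summary(SCHEMA_TEXT):
    print(f"{name:<12} {field_type}")
```

Parsing the schema first also catches syntax slips such as a missing or trailing comma, because json.loads rejects them with a json.JSONDecodeError.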