Type alias Options<T>

Options<T>: {
    autoSuggestOptions?: SearchOptions;
    autoVacuum?: boolean | AutoVacuumOptions;
    extractField?: ((document, fieldName) => string);
    fields: string[];
    idField?: string;
    logger?: ((level, message, code?) => void);
    processTerm?: ((term, fieldName?) => string | string[] | null | undefined | false);
    searchOptions?: SearchOptions;
    storeFields?: string[];
    tokenize?: ((text, fieldName?) => string[]);
}

Configuration options passed to the MiniSearch constructor

Type Parameters

  • T = any

    The type of documents being indexed.
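
Example

A minimal sketch of constructing a MiniSearch index with these options (the field names 'title' and 'text' are illustrative):

    import MiniSearch from 'minisearch'

    const miniSearch = new MiniSearch({
        fields: ['title', 'text'], // fields to index for full-text search
        storeFields: ['title'],    // fields to include in search results
        idField: 'id'              // 'id' is also the default
    })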

Type declaration

  • Optional autoSuggestOptions?: SearchOptions

    Default auto suggest options (see the SearchOptions type and the MiniSearch#autoSuggest method for details)
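
    Example

    A small sketch of setting default auto suggest options at construction time (fuzzy and prefix are SearchOptions fields; the field names are illustrative):

        import MiniSearch from 'minisearch'

        const miniSearch = new MiniSearch({
            fields: ['title', 'text'],
            autoSuggestOptions: { fuzzy: 0.2, prefix: true }
        })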

  • Optional autoVacuum?: boolean | AutoVacuumOptions

    If true (the default), vacuuming is performed automatically as soon as MiniSearch#discard is called a certain number of times, cleaning up obsolete references from the index. If false, no automatic vacuuming is performed. Custom settings controlling auto vacuuming thresholds, as well as batching behavior, can be passed as an object (see the AutoVacuumOptions type).
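
    Example

    A sketch of disabling automatic vacuuming, relying on manual calls to MiniSearch#vacuum instead (field names are illustrative):

        import MiniSearch from 'minisearch'

        const miniSearch = new MiniSearch({
            fields: ['title', 'text'],
            autoVacuum: false // vacuum manually with miniSearch.vacuum()
        })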

  • Optional extractField?: ((document, fieldName) => string)

    Function used to extract the value of each field in documents. By default, the documents are assumed to be plain objects with field names as keys, but by specifying a custom extractField function one can completely customize how the fields are extracted.

    The function takes as arguments the document, and the name of the field to extract from it. It should return the field value as a string.

    Remarks

    The returned string is fed into the tokenize function to split it up into tokens.

      • (document, fieldName): string
      • Parameters

        • document: T
        • fieldName: string

        Returns string
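
    Example

    A sketch of a custom extractField that resolves nested fields specified with dot notation, such as 'author.name' (the document shape is assumed for illustration):

        import MiniSearch from 'minisearch'

        const miniSearch = new MiniSearch({
            fields: ['title', 'author.name'],
            // Walk the dot-separated path into the document, e.g. doc.author.name
            extractField: (document, fieldName) =>
                fieldName.split('.').reduce((doc, key) => doc && doc[key], document)
        })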

  • fields: string[]

    Names of the document fields to be indexed.

  • Optional idField?: string

    Name of the ID field, uniquely identifying a document.

  • Optional logger?: ((level, message, code?) => void)

    Function called to log messages. Arguments are a log level ('debug', 'info', 'warn', or 'error'), a log message, and an optional string code that identifies the reason for the log.

    The default implementation uses console, if defined.

      • (level, message, code?): void
      • Parameters

        • level: LogLevel
        • message: string
        • Optional code: string

        Returns void
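
    Example

    A sketch of a custom logger that prefixes messages and routes them to the matching console method (assuming console is available in the environment):

        import MiniSearch from 'minisearch'

        const miniSearch = new MiniSearch({
            fields: ['title', 'text'],
            logger: (level, message, code) => {
                // level is one of 'debug', 'info', 'warn', 'error'
                console[level](`MiniSearch${code ? ` [${code}]` : ''}: ${message}`)
            }
        })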

  • Optional processTerm?: ((term, fieldName?) => string | string[] | null | undefined | false)

    Function used to process a term before indexing or search. This can be used for normalization (such as stemming). By default, terms are downcased, and no other normalization is performed.

    The function takes as arguments a term to process, and the name of the field it comes from. It should return the processed term as a string, or a falsy value to reject the term entirely.

    It can also return an array of strings, in which case each string in the returned array is indexed as a separate term.

    Remarks

    During the document indexing phase, the first step is to call the extractField function to fetch the requested field value from the document. This value is then passed to the tokenizer, which breaks it apart into "terms". Each of these terms is then passed through the processTerm function. A term might, for example, be "lbs", in which case one would likely want to return [ "lbs", "lb", "pound", "pounds" ]. You may also return a single string, or a falsy value to skip indexing a specific term entirely.

    Truthy return values are then fed to the indexer as positive matches for this document. In the example above, all four of the terms [ "lbs", "lb", "pound", "pounds" ] would be added to the index, matching against the document currently being indexed.

    Note: whatever values are returned from this function receive no further processing before being indexed. This means, for example, that if a returned term includes leading or trailing whitespace, it is indexed with that whitespace included.

      • (term, fieldName?): string | string[] | null | undefined | false
      • Parameters

        • term: string
        • Optional fieldName: string

        Returns string | string[] | null | undefined | false
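
    Example

    A sketch of a processTerm function that downcases terms, rejects a few stop words, and expands one term into several index entries (the stop word list and the "lbs" expansion are illustrative):

        import MiniSearch from 'minisearch'

        const stopWords = new Set(['a', 'an', 'and', 'or', 'the', 'to'])

        const miniSearch = new MiniSearch({
            fields: ['title', 'text'],
            processTerm: (term) => {
                const lower = term.toLowerCase()
                if (stopWords.has(lower)) return null // reject the term entirely
                if (lower === 'lbs') return ['lbs', 'lb', 'pound', 'pounds'] // index all variants
                return lower
            }
        })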

  • Optional searchOptions?: SearchOptions

    Default search options (see the SearchOptions type and the MiniSearch#search method for details)
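
    Example

    A sketch of default search options applied to every search, boosting matches in one field and allowing some fuzziness (boost and fuzzy are SearchOptions fields; field names are illustrative):

        import MiniSearch from 'minisearch'

        const miniSearch = new MiniSearch({
            fields: ['title', 'text'],
            searchOptions: {
                boost: { title: 2 }, // matches in 'title' weigh twice as much
                fuzzy: 0.2           // tolerate small differences in spelling
            }
        })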

  • Optional storeFields?: string[]

    Names of fields to store, so that search results include them. By default, no fields are stored, so results only contain the id field.
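
    Example

    A sketch of storing a couple of fields so that they are returned with each search result (the field names and the result shape shown in the comment are illustrative):

        import MiniSearch from 'minisearch'

        const miniSearch = new MiniSearch({
            fields: ['title', 'text'],
            storeFields: ['title', 'year']
        })

        // A result would then include the stored fields alongside the id,
        // e.g. { id: 1, score: ..., title: 'Moby Dick', year: 1851, ... }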

  • Optional tokenize?: ((text, fieldName?) => string[])

    Function used to split a field value into individual terms to be indexed. The default tokenizer separates terms by space or punctuation, but a custom tokenizer can be provided for custom logic.

    The function takes as arguments the string to tokenize, and the name of the field it comes from. It should return the terms as an array of strings. When used to tokenize a search query instead of a document field, fieldName is undefined.

    Remarks

    This function is called after extractField extracts a truthy value from a field, and is expected to split the extracted text into tokens (more commonly referred to as "terms" in this context). The split might be simple, for example on word boundaries, or more complex, taking into account encoding or parsing needs, or special cases. Consider, for example, indexing the text "short-term": one would likely want to treat this case specially and return the two terms [ "short", "term" ].

    Alternatively, such a case could be handled by the processTerm function, which is designed to turn each token into one or more terms. In any case, the purpose of this function is to split the provided text into parts that the processTerm function can then process.

      • (text, fieldName?): string[]
      • Parameters

        • text: string
        • Optional fieldName: string

        Returns string[]
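
    Example

    A sketch of a custom tokenizer that splits on whitespace and punctuation, including hyphens, so that "short-term" produces the two terms "short" and "term" (the character class is illustrative):

        import MiniSearch from 'minisearch'

        const miniSearch = new MiniSearch({
            fields: ['title', 'text'],
            tokenize: (text) =>
                text.split(/[\s\-.,;:!?()'"]+/).filter((term) => term.length > 0)
        })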