Type alias SearchOptions

SearchOptions: {
    bm25?: BM25Params;
    boost?: {
        [fieldName: string]: number;
    };
    boostDocument?: ((documentId, term, storedFields?) => number);
    boostTerm?: ((term, i, terms) => number);
    combineWith?: CombinationOperator;
    fields?: string[];
    filter?: ((result) => boolean);
    fuzzy?: boolean | number | ((term, index, terms) => boolean | number);
    maxFuzzy?: number;
    prefix?: boolean | ((term, index, terms) => boolean);
    processTerm?: ((term) => string | string[] | null | undefined | false);
    tokenize?: ((text) => string[]);
    weights?: {
        fuzzy: number;
        prefix: number;
    };
}

Search options to customize the search behavior.

Type declaration

  • Optional bm25?: BM25Params

    BM25+ algorithm parameters. Customizing these is almost never necessary, and fine-tuning them requires an understanding of the BM25 scoring model. In most cases, it is best to omit this option and use the defaults, and to rely on boosting to tweak scoring for specific use cases.

  • Optional boost?: {
        [fieldName: string]: number;
    }

    Key-value object mapping field names to boosting values. By default, fields are assigned a boosting factor of 1. If a field is assigned a boosting value of 2, a result that matches the query in that field is assigned a score twice as high as a result matching the query in another field, all else being equal.

    • [fieldName: string]: number
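
    As an illustration, a boost map might look like the sketch below. The field names title and text are hypothetical, and the Boost alias is a local mirror of the shape above rather than a library import. The boostFor helper only illustrates the documented default of 1; it is not part of the API.

```typescript
// Local mirror of the `boost` shape documented above (not imported from the library).
type Boost = { [fieldName: string]: number };

// Hypothetical field names: a match in `title` counts twice as much as a match in `text`.
const boost: Boost = { title: 2, text: 1 };

// Illustrative helper: fields that are not listed fall back to the documented default of 1.
function boostFor(fieldName: string): number {
  return Object.prototype.hasOwnProperty.call(boost, fieldName) ? boost[fieldName] ?? 1 : 1;
}
```

    The boost object would then be passed as the boost property of the search options.
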
  • Optional boostDocument?: ((documentId, term, storedFields?) => number)

    Function to calculate a boost factor for documents. It takes as arguments the document ID, a term that matches the search in that document, and the values of the stored fields for the document (if any). It should return a boosting factor: a number higher than 1 increases the computed score, a number lower than 1 decreases the score, and a falsy value skips the search result completely.

      • (documentId, term, storedFields?): number
      • Parameters

        • documentId: any
        • term: string
        • Optional storedFields: Record<string, unknown>

        Returns number
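
        A minimal sketch of such a function, assuming each document stores a numeric year field and an optional archived flag (both hypothetical). The BoostDocument alias mirrors the signature above locally, with documentId typed as unknown for safety.

```typescript
// Local mirror of the documented signature (not imported from the library).
type BoostDocument = (
  documentId: unknown,
  term: string,
  storedFields?: Record<string, unknown>
) => number;

// Assumption: documents store a numeric `year` and an optional boolean `archived`.
const boostDocument: BoostDocument = (_documentId, _term, storedFields) => {
  if (storedFields?.archived === true) return 0; // falsy value: skip this result
  const year = storedFields?.year;
  if (typeof year !== "number") return 1; // neutral when no usable stored field exists
  return year >= 2020 ? 1.5 : 0.8; // favor recent documents, demote older ones
};
```
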

  • Optional boostTerm?: ((term, i, terms) => number)

    Function to calculate a boost factor for each term.

    This function, if provided, is called for each query term (as split by tokenize and processed by processTerm). The arguments passed to the function are the query term, the positional index of the term in the query, and the array of all query terms. It is expected to return a numeric boost factor for the term. A factor lower than 1 reduces the importance of the term; a factor greater than 1 increases it. A factor of exactly 1 is neutral and does not affect the term's importance.

      • (term, i, terms): number
      • Parameters

        • term: string
        • i: number
        • terms: string[]

        Returns number
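
        For illustration, the sketch below applies a hypothetical policy: very short terms count half, and the first term in the query counts double. The BoostTerm alias is a local mirror of the signature above.

```typescript
// Local mirror of the documented signature (not imported from the library).
type BoostTerm = (term: string, i: number, terms: string[]) => number;

// Hypothetical policy: terms of one or two characters count half;
// otherwise the first query term counts double and the rest are neutral.
const boostTerm: BoostTerm = (term, i, _terms) => {
  if (term.length <= 2) return 0.5;
  return i === 0 ? 2 : 1;
};

// Compute the factor for each term of an already tokenized query.
const queryTerms = ["zen", "of", "motorcycle"];
const factors = queryTerms.map((term, i) => boostTerm(term, i, queryTerms));
```
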

  • Optional combineWith?: CombinationOperator

    The operator used to combine the partial results for each term. By default it is "OR", so a search returns results matching any of the search terms. If "AND" is given, a search returns only results matching all the search terms.
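
    The difference between the two operators can be sketched with plain sets of document IDs. This is a conceptual illustration only: the data is made up, and it ignores scoring, which the library also has to combine.

```typescript
// Per-term partial results, modelled as sets of document IDs (hypothetical data).
const partialResults: Set<number>[] = [
  new Set([1, 2, 3]), // documents matching the first term
  new Set([2, 3, 4]), // documents matching the second term
];

// "OR": keep documents that match any term.
function combineOr(sets: Set<number>[]): number[] {
  const union = new Set<number>();
  sets.forEach((s) => s.forEach((id) => union.add(id)));
  return Array.from(union).sort((a, b) => a - b);
}

// "AND": keep only documents that match every term.
function combineAnd(sets: Set<number>[]): number[] {
  const [first, ...rest] = sets;
  if (first === undefined) return [];
  return Array.from(first)
    .filter((id) => rest.every((s) => s.has(id)))
    .sort((a, b) => a - b);
}

const anyTerm = combineOr(partialResults);
const allTerms = combineAnd(partialResults);
```
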

  • Optional fields?: string[]

    Names of the fields to search in. If omitted, all fields are searched.

  • Optional filter?: ((result) => boolean)

    Function used to filter search results, for example on the basis of stored fields. It takes as argument each search result and should return a boolean to indicate if the result should be kept or not.

      • (result): boolean
      • Parameters

        • result: SearchResult

        Returns boolean
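
        A minimal sketch of a filter based on a stored field. ResultLike is a simplified stand-in for a search result, and category is assumed to be a stored field; both are hypothetical.

```typescript
// Simplified stand-in for a search result: an id, a score, and any stored fields.
type ResultLike = { id: unknown; score: number; [storedField: string]: unknown };

// Assumption: `category` is a stored field on each document.
const filter = (result: ResultLike): boolean => result.category === "fiction";

// Applying the predicate by hand to some made-up results.
const sampleResults: ResultLike[] = [
  { id: 1, score: 3.2, category: "fiction" },
  { id: 2, score: 2.7, category: "essay" },
];
const kept = sampleResults.filter(filter);
```
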

  • Optional fuzzy?: boolean | number | ((term, index, terms) => boolean | number)

    Controls whether to perform fuzzy search. It can be a simple boolean, or a number, or a function.

    If a boolean is given, fuzzy search with a default fuzziness parameter is performed if true.

    If a number greater than or equal to 1 is given, fuzzy search is performed with a maximum edit distance (Levenshtein) equal to that number.

    If a number between 0 and 1 is given, fuzzy search is performed within a maximum edit distance corresponding to that fraction of the term length, rounded to the nearest integer. For example, 0.2 would mean an edit distance of 20% of the term length, so 1 character in a 5-character term. The calculated fuzziness value is limited by the maxFuzzy option, to prevent slowdown for very long queries.

    If a function is passed, the function is called upon search with a search term, a positional index of that term in the tokenized search query, and the tokenized search query. It should return a boolean or a number, with the meaning documented above.
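
    The rules above can be sketched as a small helper that resolves a fuzzy option to a maximum edit distance for one term. This is an illustration, not the library's code: the helper name is made up, and since the default fuzziness used for true is not stated above, it is a parameter here with a placeholder value of 0.2.

```typescript
// Local mirror of the documented option type.
type FuzzyOption =
  | boolean
  | number
  | ((term: string, index: number, terms: string[]) => boolean | number);

// Hypothetical helper: resolve the option to a maximum edit distance for one term.
function maxEditDistance(
  fuzzy: FuzzyOption,
  term: string,
  index: number,
  terms: string[],
  maxFuzzy = 6, // documented default for the maxFuzzy option
  defaultFuzzy = 0.2 // assumption: placeholder for the unspecified default
): number {
  const value = typeof fuzzy === "function" ? fuzzy(term, index, terms) : fuzzy;
  if (value === false || value === 0) return 0; // no fuzzy search for this term
  const amount = value === true ? defaultFuzzy : value;
  if (amount >= 1) return amount; // absolute edit distance
  // Fraction of the term length, rounded, and capped by maxFuzzy.
  return Math.min(maxFuzzy, Math.round(term.length * amount));
}

// Example of the function form: only terms longer than 3 characters are fuzzy.
const fuzzyForLongTerms: FuzzyOption = (term) => (term.length > 3 ? 0.2 : false);
```
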

  • Optional maxFuzzy?: number

    Controls the maximum fuzziness when using a fractional fuzzy value. This is set to 6 by default. Very high edit distances usually don't produce meaningful results, but can excessively impact search performance.

  • Optional prefix?: boolean | ((term, index, terms) => boolean)

    Controls whether to perform prefix search. It can be a simple boolean, or a function.

    If a boolean is passed, prefix search is performed if true.

    If a function is passed, it is called upon search with a search term, the positional index of that search term in the tokenized search query, and the tokenized search query. The function should return a boolean to indicate whether to perform prefix search for that search term.
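
    As a sketch of the function form, the example below enables prefix search only for the last term of the query, which suits search-as-you-type input where only the final word may be incomplete. The PrefixFn alias is a local mirror of the function signature above, and the query terms are made up.

```typescript
// Local mirror of the function form of the `prefix` option.
type PrefixFn = (term: string, index: number, terms: string[]) => boolean;

// Prefix-match only the last term, which the user may still be typing.
const prefix: PrefixFn = (_term, index, terms) => index === terms.length - 1;

// Evaluate the predicate for each term of an already tokenized query.
const typedTerms = ["zen", "mot"];
const prefixFlags = typedTerms.map((term, i) => prefix(term, i, typedTerms));
```
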

  • Optional processTerm?: ((term) => string | string[] | null | undefined | false)

    Function to process or normalize terms in the search query. By default, the same term processor used for indexing is also used for search.

    Remarks

    During the document indexing phase, the first step is to call the extractField function to fetch the requested value/field from the document. The value is then passed to the tokenize function, which breaks it apart into "terms". Each term is then passed through this function on its own. A term might, for example, be something like "lbs", in which case one would likely want to return [ "lbs", "lb", "pound", "pounds" ]. You may also return just a single string, or a falsy value if you would like to skip indexing entirely for a specific term.

    Truthy return value(s) are then fed to the indexer as positive matches for this document. In our example above, all four of the [ "lbs", "lb", "pound", "pounds" ] terms would be added to the indexing engine, matching against the current document being computed.

    Note: Whatever values are returned from this function will receive no further processing before being indexed. This means, for example, that if you include whitespace at the beginning or end of a word, it will be indexed that way, with the whitespace included.

      • (term): string | string[] | null | undefined | false
      • Parameters

        • term: string

        Returns string | string[] | null | undefined | false
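
        A minimal sketch of a term processor that lowercases terms, drops stop words, and expands the "lbs" example from the remarks. The stop word list and the synonym table are hypothetical, and the ProcessTerm alias is a local mirror of the signature above.

```typescript
// Local mirror of the documented signature (not imported from the library).
type ProcessTerm = (term: string) => string | string[] | null | undefined | false;

// Hypothetical stop words and synonym expansions.
const stopWords = new Set<string>(["and", "or", "the"]);
const synonyms = new Map<string, string[]>([["lbs", ["lbs", "lb", "pound", "pounds"]]]);

const processTerm: ProcessTerm = (term) => {
  // Trim explicitly: returned values receive no further processing.
  const normalized = term.trim().toLowerCase();
  if (normalized === "" || stopWords.has(normalized)) return null; // skip this term
  return synonyms.get(normalized) ?? normalized; // expand, or return a single string
};
```
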

  • Optional tokenize?: ((text) => string[])

    Function to tokenize the search query. By default, the same tokenizer used for indexing is also used for search.

    Remarks

    This function is called after extractField extracts a truthy value from a field. It is then expected to split the extracted text into tokens (more commonly referred to as "terms" in this context). The split might be simple, for example on word boundaries, or it might be more complex, taking into account particular encodings, parsing needs, or special cases. Think about how one might index the term "short-term". You would likely want to treat this case specially and return two terms instead, [ "short", "term" ].

    Or, you could let such a case be handled by the processTerm function, which is designed to turn each token/term into whole terms or sub-terms. In any case, the purpose of this function is to split apart the provided text document into parts that can be processed by the processTerm function.

      • (text): string[]
      • Parameters

        • text: string

        Returns string[]
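
        One way to handle the "short-term" case from the remarks is to split on hyphens as well as whitespace. The sketch below is one possible tokenizer, not the library's default; the Tokenize alias is a local mirror of the signature above.

```typescript
// Local mirror of the documented signature (not imported from the library).
type Tokenize = (text: string) => string[];

// Split on runs of whitespace and hyphens, then drop empty tokens
// (which appear when the text starts or ends with a separator).
const tokenize: Tokenize = (text) =>
  text.split(/[\s-]+/).filter((token) => token.length > 0);

const tokens = tokenize("short-term memory");
```
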

  • Optional weights?: {
        fuzzy: number;
        prefix: number;
    }

    Relative weights to assign to prefix search results and fuzzy search results. Exact matches are assigned a weight of 1.

    • fuzzy: number
    • prefix: number
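
    As a rough illustration of relative weights, the sketch below maps a kind of match to its weight. The numeric values and the weightFor helper are hypothetical, and how the library folds the weight into the final score is not specified above.

```typescript
// Local mirror of the documented `weights` shape.
type Weights = { fuzzy: number; prefix: number };
type MatchKind = "exact" | "prefix" | "fuzzy";

// Hypothetical values: prefix matches count half, fuzzy matches a fifth.
const weights: Weights = { fuzzy: 0.2, prefix: 0.5 };

// Exact matches always weigh 1, as documented; other kinds use the configured weight.
function weightFor(kind: MatchKind, w: Weights): number {
  return kind === "exact" ? 1 : w[kind];
}
```
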