Optional
bm25?: BM25Params
BM25+ algorithm parameters. Customizing these is almost never necessary, and fine-tuning them requires an understanding of the BM25 scoring model. In most cases, it is best to omit this option and use the defaults, and to use boosting instead to tweak scoring for specific use cases.
Optional
boost?: {
Key-value object mapping field names to boosting values. By default, each field has a boosting factor of 1. If a field is assigned a boosting value of 2, a result that matches the query in that field gets a score twice as high as a result matching the query in another field, all else being equal.
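The effect of a field boost can be sketched with plain arithmetic. This is only an illustration of the "twice as high, all else being equal" rule described above; `boostedScore` and `FieldBoosts` are hypothetical names, and the library's real scoring involves more than a single multiplication.

```typescript
// Hypothetical helper, not part of the library: it only illustrates how a
// per-field boosting factor scales an otherwise identical base score.
type FieldBoosts = Record<string, number>;

function boostedScore(baseScore: number, field: string, boost: FieldBoosts = {}): number {
  // Fields without an explicit entry fall back to the default factor of 1.
  const factor = boost[field] ?? 1;
  return baseScore * factor;
}

const boost: FieldBoosts = { title: 2 };
const titleScore = boostedScore(1.5, "title", boost); // match in the boosted field
const bodyScore = boostedScore(1.5, "body", boost);   // match in an unboosted field
```

With the same base score, the match in `title` ends up with exactly twice the score of the match in `body`.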
Optional
boost
Function to calculate a boost factor for documents. It takes as arguments the document ID, a term that matches the search in that document, and the value of the stored fields for the document (if any). It should return a boosting factor: a number higher than 1 increases the computed score, a number lower than 1 decreases the score, and a falsy value skips the search result completely.
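A callback of this shape might look like the sketch below. The option name is truncated in the text above, so `boostByStoredFields` is a hypothetical function name, and the `archived` and `featured` stored fields are assumptions made for the example.

```typescript
type StoredFields = Record<string, unknown>;

// Hypothetical document-boost callback. It receives the document ID, a
// matching term, and the stored fields of the document (if any).
function boostByStoredFields(
  documentId: unknown,
  term: string,
  storedFields?: StoredFields
): number {
  // A falsy return value (here 0) skips the search result completely.
  if (storedFields?.archived === true) return 0;
  // A factor higher than 1 increases the computed score.
  if (storedFields?.featured === true) return 1.5;
  // A factor of 1 leaves the score unchanged.
  return 1;
}
```

Returning `0` for archived documents uses the "falsy value skips the result" rule, which avoids a separate filtering step for that case.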
Optional
storedFields: Record<string, unknown>
Optional
boost
Function to calculate a boost factor for each term.
This function, if provided, is called for each query term (as split by tokenize and processed by processTerm). The arguments passed to the function are the query term, the positional index of the term in the query, and the array of all query terms. It is expected to return a numeric boost factor for the term. A factor lower than 1 reduces the importance of the term, a factor greater than 1 increases it. A factor of exactly 1 is neutral, and does not affect the term's importance.
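As a rough illustration of the three arguments described above, the following sketch gives the first query term more importance and very short terms less. The function name `boostFirstTerm` and the specific factors are hypothetical choices for the example.

```typescript
// Hypothetical term-boost callback: (term, index, terms) => number.
function boostFirstTerm(term: string, index: number, terms: string[]): number {
  // Very short terms are assumed to carry little meaning: reduce their importance.
  if (term.length <= 2) return 0.5;
  // Give the first term of a multi-term query extra importance.
  if (index === 0 && terms.length > 1) return 2;
  // Neutral factor: does not affect the term's importance.
  return 1;
}

const queryTerms = ["zen", "of", "motorcycle"];
const factors = queryTerms.map((term, index) => boostFirstTerm(term, index, queryTerms));
```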
Optional
combine
The operator used to combine partial results for each term. The default is "OR", so a search returns results matching any of the search terms. If "AND" is given, a search returns only results matching all of the search terms.
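The difference between the two operators can be sketched as a union versus an intersection of per-term result sets. This is a simplified model working on arrays of document IDs; `combineResults` and `Combinator` are hypothetical names, and the library's own implementation also has to merge scores.

```typescript
type Combinator = "OR" | "AND";

// Hypothetical helper: combines the document IDs matched by each query term.
function combineResults(perTerm: number[][], operator: Combinator = "OR"): number[] {
  if (perTerm.length === 0) return [];
  return perTerm.slice(1).reduce((acc, ids) => {
    if (operator === "AND") {
      // Intersection: keep only documents matched by every term so far.
      return acc.filter((id) => ids.indexOf(id) !== -1);
    }
    // Union: add documents matched by this term that are not yet present.
    return acc.concat(ids.filter((id) => acc.indexOf(id) === -1));
  }, perTerm[0].slice());
}

const perTermMatches = [[1, 2, 3], [2, 3, 4]];
const anyTerm = combineResults(perTermMatches);         // documents 1, 2, 3, 4
const allTerms = combineResults(perTermMatches, "AND"); // documents 2, 3
```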
Optional
fields?: string[]
Names of the fields to search in. If omitted, all fields are searched.
Optional
filter?: ((result) => boolean)
Function used to filter search results, for example on the basis of stored fields. It is called with each search result, and should return a boolean indicating whether the result should be kept.
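A filter predicate can be as small as a check on one stored field. In the sketch below, the `ExampleResult` shape and its `category` field are assumptions for illustration, and `Array.prototype.filter` stands in for the search engine applying the predicate.

```typescript
// Hypothetical result shape: `category` is assumed to be a stored field.
interface ExampleResult {
  id: number;
  score: number;
  category?: string;
}

// Filter predicate: keep only results in the "fiction" category.
const onlyFiction = (result: ExampleResult): boolean => result.category === "fiction";

const candidateResults: ExampleResult[] = [
  { id: 1, score: 3.2, category: "fiction" },
  { id: 2, score: 2.9, category: "essay" },
  { id: 3, score: 1.4 },
];

// Stand-in for the search engine applying the predicate to each result.
const keptResults = candidateResults.filter(onlyFiction);
```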
Optional
fuzzy?: boolean | number | ((term, index, terms) => boolean | number)
Controls whether to perform fuzzy search. It can be a boolean, a number, or a function.
If a boolean is given, fuzzy search with a default fuzziness parameter is performed if true.
If a number greater than or equal to 1 is given, fuzzy search is performed with a maximum edit distance (Levenshtein) equal to that number.
If a number between 0 and 1 is given, fuzzy search is performed within a maximum edit distance corresponding to that fraction of the term length, rounded to the nearest integer. For example, 0.2 means an edit distance of 20% of the term length, so 1 character in a 5-character term. The calculated fuzziness value is limited by the maxFuzzy option, to prevent slowdown for very long queries.
If a function is passed, it is called upon search with a search term, the positional index of that term in the tokenized search query, and the tokenized search query. It should return a boolean or a number, with the meaning documented above.
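The numeric rules above can be sketched as a small function. This is an approximation written from the description only, assuming the default cap of 6 documented for maxFuzzy; `maxEditDistance` and `fuzzyForLongTerms` are hypothetical names, and the boolean case is left out because the default fuzziness value is not stated here.

```typescript
// Hypothetical helper: derives a maximum edit distance from a numeric fuzzy value.
function maxEditDistance(term: string, fuzzy: number, maxFuzzy: number = 6): number {
  // A value of 1 or more is used directly as the maximum edit distance.
  if (fuzzy >= 1) return fuzzy;
  // A fraction is applied to the term length, rounded to the nearest integer,
  // and limited by maxFuzzy.
  return Math.min(maxFuzzy, Math.round(term.length * fuzzy));
}

// Function form of the option: fuzzy match only terms longer than 3 characters.
const fuzzyForLongTerms = (term: string, _index: number, _terms: string[]): boolean | number =>
  term.length > 3 ? 0.2 : false;

const shortTermDistance = maxEditDistance("hello", 0.2);      // 20% of 5 characters
const longTermDistance = maxEditDistance("a".repeat(40), 0.2); // 20% of 40, then capped
```

For the 5-character term the result is 1, matching the example in the text, and for the 40-character term the raw value of 8 is capped at 6.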
Optional
maxFuzzy
Controls the maximum fuzziness when using a fractional fuzzy value. The default is 6. Very high edit distances usually don't produce meaningful results, and can severely degrade search performance.
Optional
prefix?: boolean | ((term, index, terms) => boolean)
Controls whether to perform prefix search. It can be a boolean or a function.
If a boolean is passed, prefix search is performed if true.
If a function is passed, it is called upon search with a search term, the positional index of that search term in the tokenized search query, and the tokenized search query. The function should return a boolean to indicate whether to perform prefix search for that search term.
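One plausible use of the function form is search-as-you-type, where only the last term is likely to be incomplete. The sketch below follows the `(term, index, terms)` signature described above; the name `prefixLastTermOnly` is hypothetical.

```typescript
// Hypothetical prefix callback: perform prefix search only for the last
// term of the query, which the user may still be typing.
const prefixLastTermOnly = (_term: string, index: number, terms: string[]): boolean =>
  index === terms.length - 1;

const typedTerms = ["motorcycle", "mainten"];
const prefixFlags = typedTerms.map((term, index) => prefixLastTermOnly(term, index, typedTerms));
```

Here "motorcycle" is matched as a whole term, while "mainten" is matched as a prefix.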
Optional
processTerm
Function to process or normalize terms in the search query. By default, the same term processor used for indexing is also used for search.
During the document indexing phase, the first step is to call the extractField function to fetch the requested value/field from the document. This value is then passed to the tokenize function, which breaks it apart into "terms". Each term is then passed individually through this function. A term might, for example, be something like "lbs", in which case one would likely want to return [ "lbs", "lb", "pound", "pounds" ]. You may also return just a single string, or a falsy value if you would like to skip indexing entirely for a specific term.
Truthy return values are then fed to the indexer as positive matches for this document. In the example above, all four of the [ "lbs", "lb", "pound", "pounds" ] terms would be added to the index, matching the document currently being processed.
Note: Whatever values are returned from this function receive no further processing before being indexed. This means, for example, that if you include whitespace at the beginning or end of a word, it will be indexed that way, with the whitespace included.
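The three kinds of return value described above (an array of terms, a single string, or a falsy value) can be sketched as follows. The stop-word list and the `ProcessedTerm` type alias are assumptions for the example, and the "lbs" expansion is the one used in the text.

```typescript
type ProcessedTerm = string | string[] | null;

// Hypothetical stop-word list, used to show the "skip this term" case.
const stopWords = ["the", "a", "an"];

function processTerm(term: string): ProcessedTerm {
  // Trim and lowercase here, because returned values get no further processing.
  const normalized = term.trim().toLowerCase();
  // Falsy value: skip the term entirely.
  if (normalized.length === 0 || stopWords.indexOf(normalized) !== -1) return null;
  // Array: expand one term into several equivalent terms.
  if (normalized === "lbs") return ["lbs", "lb", "pound", "pounds"];
  // Single string: the normalized term.
  return normalized;
}
```

Trimming inside the function addresses the note above: any whitespace left in the returned value would otherwise end up in the index.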
Optional
tokenize?: ((text) => string[])
Function to tokenize the search query. By default, the same tokenizer used for indexing is also used for search.
This function is called after extractField extracts a truthy value from a field. It is then expected to split the extracted text document into tokens (more commonly referred to as "terms" in this context). The split might be simple, for example on word boundaries, or more complex, taking into account particular encoding or parsing needs, or even just special cases. Think about how one might go about indexing the term "short-term". You would likely want to treat this case specially, and return two terms instead, [ "short", "term" ].
Or, you could let such a case be handled by the processTerm function, which is designed to turn each token/term into whole terms or sub-terms. In any case, the purpose of this function is to split the provided text document into parts that can be processed by the processTerm function.
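A minimal tokenizer that handles the "short-term" case from the text might split on both whitespace and hyphens. This is a sketch under that single assumption, not the library's default tokenizer, which may treat other punctuation as separators too.

```typescript
// Hypothetical tokenizer: splits on runs of whitespace and hyphens.
function tokenize(text: string): string[] {
  return text
    .split(/[\s-]+/)
    // Drop empty tokens produced by leading or trailing separators.
    .filter((token) => token.length > 0);
}

const tokens = tokenize("short-term memory loss");
```

With this tokenizer, "short-term" yields the two terms "short" and "term", so no special handling is needed later in processTerm.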
Optional
weights?: {
Relative weights to assign to prefix search results and fuzzy search results. Exact matches are assigned a weight of 1.
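The idea of relative weights can be illustrated with a simple multiplier per match kind. The shape of the weights object is truncated in the text above, so the `prefix` and `fuzzy` keys and the numeric values below are assumptions for illustration, not the library's defaults.

```typescript
type MatchKind = "exact" | "prefix" | "fuzzy";

// Assumed shape of the weights object: one relative weight per inexact match kind.
interface MatchWeights {
  prefix: number;
  fuzzy: number;
}

// Hypothetical helper: scales a base score by the weight of the match kind.
function weightedScore(baseScore: number, kind: MatchKind, weights: MatchWeights): number {
  // Exact matches always carry a weight of 1.
  const weight = kind === "exact" ? 1 : weights[kind];
  return baseScore * weight;
}

// Illustrative values only.
const exampleWeights: MatchWeights = { prefix: 0.5, fuzzy: 0.25 };
```

Keeping both weights below 1 ranks exact matches above prefix and fuzzy matches when everything else is equal.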
Search options to customize the search behavior.