Optional
autoSuggestOptions?: SearchOptions
Default auto suggest options (see the SearchOptions type and the MiniSearch#autoSuggest method for details).
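For example, a minimal sketch (field names illustrative) making MiniSearch#autoSuggest default to fuzzy and prefix matching:

```typescript
import MiniSearch from 'minisearch'

// Every call to miniSearch.autoSuggest(query) will now use fuzzy and
// prefix matching by default, unless overridden per call.
const miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  autoSuggestOptions: { fuzzy: 0.2, prefix: true }
})
```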
Optional
autoVacuum?: boolean | AutoVacuumOptions
If true (the default), vacuuming is performed automatically as soon as MiniSearch#discard is called a certain number of times, cleaning up obsolete references from the index. If false, no automatic vacuuming is performed. Custom settings controlling auto vacuuming thresholds, as well as batching behavior, can be passed as an object (see the AutoVacuumOptions type).
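As a sketch of both forms; the AutoVacuumOptions field names used here (minDirtCount, batchSize) are assumptions to be checked against that type:

```typescript
import MiniSearch from 'minisearch'

// Disable automatic vacuuming entirely; call vacuum() manually instead:
const manual = new MiniSearch({
  fields: ['title', 'text'],
  autoVacuum: false
})

// Or tune the discard threshold and batching (option names assumed
// from the AutoVacuumOptions type):
const tuned = new MiniSearch({
  fields: ['title', 'text'],
  autoVacuum: { minDirtCount: 100, batchSize: 500 }
})
```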
Optional
extractField?: ((document, fieldName) => string)
Function used to extract the value of each field in documents. By default, the documents are assumed to be plain objects with field names as keys, but by specifying a custom extractField function one can completely customize how the fields are extracted.
The function takes as arguments the document, and the name of the field to extract from it. It should return the field value as a string.
The returned string is fed into the tokenize function to split it up into tokens.
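For example, a sketch of an extractField that resolves nested fields via dot-separated names (not the default behavior):

```typescript
import MiniSearch from 'minisearch'

// Resolve field names like 'author.name' against nested objects,
// then stringify the value so it can be tokenized.
const miniSearch = new MiniSearch({
  fields: ['title', 'author.name'],
  extractField: (document, fieldName) =>
    fieldName
      .split('.')
      .reduce((doc, key) => doc && doc[key], document)
      ?.toString() ?? ''
})
```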
fields: string[]
Names of the document fields to be indexed.
Optional
idField?: string
Name of the ID field, uniquely identifying a document.
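For example (document shape illustrative):

```typescript
import MiniSearch from 'minisearch'

// Documents here carry their unique identifier in 'uid' rather than
// the default 'id' field.
const miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  idField: 'uid'
})

miniSearch.addAll([
  { uid: 1, title: 'Moby Dick', text: 'Call me Ishmael...' },
  { uid: 2, title: 'Zen and the Art of Motorcycle Maintenance', text: '...' }
])
```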
Optional
logger?: ((level, message, code?) => void)
Function called to log messages. Arguments are a log level ('debug', 'info', 'warn', or 'error'), a log message, and an optional string code that identifies the reason for the log.
The default implementation uses console, if defined.
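For example, a sketch forwarding MiniSearch messages to an application logger (appLog is a hypothetical stand-in):

```typescript
import MiniSearch from 'minisearch'

// Hypothetical application-level logging function:
const appLog = (line: string): void => { console.error(line) }

const miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  logger: (level, message, code) => {
    appLog(`[minisearch:${level}]${code ? ` (${code})` : ''} ${message}`)
  }
})
```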
Optional
processTerm?: ((term, fieldName?) => string | string[] | null | undefined | false)
Function used to process a term before indexing or search. This can be used for normalization (such as stemming). By default, terms are downcased, and no other normalization is performed.
The function takes as arguments a term to process, and the name of the field it comes from. It should return the processed term as a string, or a falsy value to reject the term entirely.
It can also return an array of strings, in which case each string in the returned array is indexed as a separate term.
During the document indexing phase, the first step is to call the extractField function to fetch the requested field value from the document. This value is then passed to the tokenizer, which breaks it apart into "terms". Each of these terms is then passed through the processTerm function. A term might, for example, be something like "lbs", in which case one would likely want to return [ "lbs", "lb", "pound", "pounds" ]. You may also return a single string value, or a falsy value to skip indexing a specific term entirely.
Truthy return values are then fed to the indexer as positive matches for this document. In the example above, all four of the terms [ "lbs", "lb", "pound", "pounds" ] would be added to the index, matching against the current document being indexed.
Note: Whatever values are returned from this function will receive no further processing before being indexed. This means for example, if you include whitespace at the beginning or end of a word, it will also be indexed that way, with the included whitespace.
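For example, a sketch implementing the "lbs" expansion above (the expansion table is illustrative):

```typescript
import MiniSearch from 'minisearch'

// Illustrative expansion table for unit abbreviations:
const UNIT_EXPANSIONS: Record<string, string[]> = {
  lbs: ['lbs', 'lb', 'pound', 'pounds']
}

const miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  processTerm: (term) => {
    const lower = term.toLowerCase()       // replaces the default downcasing
    if (lower.length <= 1) return false    // falsy: reject single characters
    return UNIT_EXPANSIONS[lower] ?? lower // expand, or index as-is
  }
})
```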
Optional
searchOptions?: SearchOptions
Default search options (see the SearchOptions type and the MiniSearch#search method for details).
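For example:

```typescript
import MiniSearch from 'minisearch'

// Every call to search() now boosts matches in 'title' and allows
// some fuzziness by default; options can still be overridden per call.
const miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  searchOptions: { boost: { title: 2 }, fuzzy: 0.2 }
})
```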
Optional
storeFields?: string[]
Names of fields to store, so that search results will include them. By default no fields are stored, so results only contain the id field.
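For example (document shape illustrative):

```typescript
import MiniSearch from 'minisearch'

const miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  storeFields: ['title', 'category']
})

miniSearch.add({ id: 1, title: 'Moby Dick', text: '...', category: 'fiction' })

// Each result now carries the stored fields alongside its id and score:
const [top] = miniSearch.search('moby')
// top.title === 'Moby Dick', top.category === 'fiction'
```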
Optional
tokenize?: ((text, fieldName?) => string[])
Function used to split a field value into individual terms to be indexed. The default tokenizer separates terms by space or punctuation, but a custom tokenizer can be provided for custom logic.
The function takes as arguments the string to tokenize, and the name of the field it comes from. It should return the terms as an array of strings. When used for tokenizing a search query instead of a document field, the fieldName is undefined.
This function is called after extractField extracts a truthy value from a field. It is then expected to split the extracted text into tokens (more commonly referred to as "terms" in this context). The resulting split might be simple, for example on word boundaries, or it might be more complex, taking into account encoding or parsing needs, or special cases. Consider how one might index the term "short-term": you would likely want to treat this case specially and return two terms instead, [ "short", "term" ].
Alternatively, such a case could be handled by the processTerm function, which is designed to turn each token into whole terms or sub-terms. In any case, the purpose of this function is to split the provided text into parts that can be processed by the processTerm function.
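For example, a sketch of a tokenizer that also splits on hyphens, so that "short-term" is indexed as the two terms "short" and "term" (the regular expression is an assumption, not MiniSearch's default):

```typescript
import MiniSearch from 'minisearch'

const miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  // Split on whitespace, hyphens, and common punctuation; drop empties.
  tokenize: (text) => text.split(/[\s\-.,;:!?'"()[\]]+/).filter(Boolean)
})
```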
Configuration options passed to the MiniSearch constructor.