NLP - (Text|Token) Normalization - Case Insensitivity - lemmas


About

A normalization process removes insignificant differences between otherwise identical words to make for better searching.

Would you search for déjà vu, or just for deja vu?

Lemmas are normalized word forms.
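
For illustration, a lemmatizer such as NLTK's WordNetLemmatizer (one possible implementation among many, not the one this page prescribes) maps inflected forms back to their lemma:

  # Minimal sketch using NLTK's WordNet lemmatizer.
  # Requires: pip install nltk, then a one-time nltk.download("wordnet")
  from nltk.stem import WordNetLemmatizer

  lemmatizer = WordNetLemmatizer()
  print(lemmatizer.lemmatize("mice"))          # mouse (default pos is noun)
  print(lemmatizer.lemmatize("running", "v"))  # run   (pos "v" = verb)
  print(lemmatizer.lemmatize("better", "a"))   # good  (pos "a" = adjective)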

Operation List

  • Uppercase to lowercase: converting all characters to lower case so that a query matches regardless of capitalization.
  • Stripping accents, diacritics, and other character markings such as ´, ^, and ¨, so that a search for rôle also matches role, and vice versa. It makes esta, ésta, and está all searchable as the same word. A minimal sketch of both operations follows this list.
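
As a rough sketch of these two operations using only Python's standard unicodedata module (the normalize function name is illustrative, not a specific search engine's API):

  import unicodedata

  def normalize(text: str) -> str:
      """Lowercase the text and strip accents/diacritics."""
      # Uppercase to lowercase (casefold also handles cases such as German ß)
      text = text.casefold()
      # Decompose each character into its base character plus combining
      # marks (NFD), then drop the combining marks (Unicode category "Mn")
      decomposed = unicodedata.normalize("NFD", text)
      stripped = "".join(ch for ch in decomposed
                         if unicodedata.category(ch) != "Mn")
      # Recompose the remaining characters
      return unicodedata.normalize("NFC", stripped)

  print(normalize("Déjà Vu"))  # deja vu
  print(normalize("rôle"))     # role
  print(normalize("está"))     # esta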

Documentation / Reference




