My Learnings: 2018

Elasticsearch uses an inverted index data structure.

Inverted index representation of three sample documents:
Doc1: This is first sample document
Doc2: Second document for the Inverted Index
Doc3: Final sample document

Inverted index of the above three documents:

Term (dictionary)   Frequency   Referred documents

this                1           1
is                  1           1
first               1           1
sample              2           1, 3
document            3           1, 2, 3
second              1           2
for                 1           2
the                 1           2
inverted            1           2
index               1           2
final               1           3
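
To make the table concrete, here is a minimal Python sketch that builds the same mapping from the three sample documents. The names DOCS and build_inverted_index are my own, and the tokenization (lowercase, split on whitespace) is an assumption for illustration; a real Elasticsearch analyzer does more than this.

```python
from collections import defaultdict

# The three sample documents from the post, keyed by document number.
DOCS = {
    1: "This is first sample document",
    2: "Second document for the Inverted Index",
    3: "Final sample document",
}


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumed tokenization: lowercase, then split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


index = build_inverted_index(DOCS)

# Print term, frequency (number of documents) and the referred documents.
for term, doc_ids in index.items():
    print(term, len(doc_ids), doc_ids)
```

Each printed row corresponds to one row of the table: the term, the number of documents it appears in, and the list of those documents.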

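Index (forward index):
lists the terms contained in a specific document. An inverted index does the reverse: for each term, it lists the documents that contain it.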

Some advantages of an inverted index:
Getting the list of all documents that contain a given term or terms
AND and OR queries over terms
Prefix-based searching
Suffix-based searching (reverse the indexed terms, reverse the search suffix, then search by prefix)
Example: original term: fantastic, search suffix: astic. Reversed term: citsatnaf, reversed search suffix: citsa. A prefix search now returns all reversed terms that start with the reversed search suffix.
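
The lookups listed so far can be sketched in a few lines of Python. This is only an illustration under my own assumptions: INDEX is a hand-written copy of the sample index, the function names are hypothetical, and the sorted list plus binary search stands in for the sorted term dictionary, not for how Lucene actually stores terms.

```python
from bisect import bisect_left

# Hypothetical in-memory index for the sample documents: term -> set of document ids.
INDEX = {
    "this": {1}, "is": {1}, "first": {1}, "sample": {1, 3},
    "document": {1, 2, 3}, "second": {2}, "for": {2}, "the": {2},
    "inverted": {2}, "index": {2}, "final": {3},
}


def docs_with_all(index, terms):
    """AND: documents that contain every one of the terms."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def docs_with_any(index, terms):
    """OR: documents that contain at least one of the terms."""
    return set().union(*(index.get(t, set()) for t in terms))


def prefix_matches(sorted_terms, prefix):
    """All terms starting with prefix, located by binary search in a sorted list."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # matching terms are contiguous in sorted order
        matches.append(term)
    return matches


def suffix_matches(terms, suffix):
    """Suffix search done as a prefix search over the reversed terms."""
    reversed_terms = sorted(t[::-1] for t in terms)
    hits = prefix_matches(reversed_terms, suffix[::-1])
    return sorted(t[::-1] for t in hits)


print(docs_with_all(INDEX, ["sample", "document"]))
print(docs_with_any(INDEX, ["first", "final"]))
print(prefix_matches(sorted(INDEX), "fi"))
print(suffix_matches(["fantastic", "plastic", "index"], "astic"))
```

The suffix case follows the fantastic / astic example: both the terms and the search suffix are reversed, and the ordinary prefix lookup does the rest.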

Finding substrings (by splitting the terms into n-grams and searching for those n-grams)
Searching numbers, e.g. between 100 and 199 (Lucene can store 123 as "1"-hundreds, "12"-tens and "123", so a search for 100 to 199 returns all terms matching "1"-hundreds and avoids other numbers such as 1234)
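
A rough Python sketch of these last two ideas follows. Everything here is an assumption made for illustration: the helper names are mine, the n-gram size of 3 is arbitrary, and the "-hundreds" / "-tens" strings only mimic the example in the text; this is not Lucene's real numeric encoding.

```python
from collections import defaultdict


def ngrams(term, n=3):
    """All contiguous substrings of length n."""
    return [term[i:i + n] for i in range(len(term) - n + 1)]


def build_ngram_index(terms, n=3):
    """Map each n-gram to the set of terms that contain it."""
    grams = defaultdict(set)
    for term in terms:
        for gram in ngrams(term, n):
            grams[gram].add(term)
    return grams


def substring_matches(gram_index, query, n=3):
    """Terms containing query: intersect the n-gram postings, then verify."""
    grams = ngrams(query, n)
    if not grams:
        raise ValueError("query must be at least n characters long")
    candidates = set.intersection(*(gram_index.get(g, set()) for g in grams))
    # Sharing all n-grams does not guarantee they are adjacent, so re-check.
    return sorted(t for t in candidates if query in t)


def precision_terms(number):
    """Illustrative encoding: 123 -> '1-hundreds', '12-tens', '123'."""
    return [f"{number // 100}-hundreds", f"{number // 10}-tens", str(number)]


def build_numeric_index(numbers):
    """Map each precision term to the set of numbers that produced it."""
    index = defaultdict(set)
    for number in numbers:
        for term in precision_terms(number):
            index[term].add(number)
    return index


gram_index = build_ngram_index(["fantastic", "plastic", "index"])
print(substring_matches(gram_index, "tast"))

numeric_index = build_numeric_index([99, 123, 150, 200, 1234])
# The whole range 100 to 199 is a single term lookup.
print(sorted(numeric_index["1-hundreds"]))
```

In this sketch 1234 is indexed under "12-hundreds", so the "1"-hundreds lookup does not return it, which is the point made in the text.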

References :
https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up


Thursday, June 21, 2018

Elastic Search - blog 1 ( inverted Index )
