Apache Lucene(TM) is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
Lucene is a powerful and efficient way to add customized full-text search to an application. Let's explore it; the example below is largely self-explanatory.
Step 1: Create a Java project in Eclipse and add the Lucene core JAR to the project's build path.
Step 2: Create a class LuceneIndexnSearch as shown below:
package com.test;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.LockObtainFailedException;
// Note: this class uses the older Lucene 2.x API. Several of these calls are
// deprecated there (hence the annotation) and are not available in later releases.
@SuppressWarnings("deprecation")
public class LuceneIndexnSearch {

    public static final String SOURCE_FILE = "sourceFileToIndex"; // folder holding the files to index
    public static final String INDEX_DIR = "indexDir";            // folder where the index is written
    public static final String FIELD_PATH = "path";
    public static final String FIELD_CONTENTS = "contents";

    public void createIndex() throws CorruptIndexException, LockObtainFailedException, IOException {
        Analyzer analyzer = new StandardAnalyzer();
        boolean recreateIndexIfExists = true;
        IndexWriter indexWriter = new IndexWriter(INDEX_DIR, analyzer, recreateIndexIfExists);
        File dir = new File(SOURCE_FILE);
        File[] files = dir.listFiles();
        if (files == null) {
            // listFiles() returns null when the folder is missing or unreadable
            indexWriter.close();
            throw new IOException("Cannot list files in " + dir.getAbsolutePath());
        }
        for (File file : files) {
            Document document = new Document();
            // Store the path as a single, untokenized value so it can be shown in results
            String path = file.getCanonicalPath();
            document.add(new Field(FIELD_PATH, path, Field.Store.YES, Field.Index.UN_TOKENIZED));
            // Index the file contents (tokenized by the analyzer, not stored)
            Reader reader = new FileReader(file);
            document.add(new Field(FIELD_CONTENTS, reader));
            indexWriter.addDocument(document);
        }
        indexWriter.optimize();
        indexWriter.close();
    }

    public static void searchIndex(String searchString) throws IOException, ParseException {
        System.out.println("Searching for '" + searchString + "'");
        Directory directory = FSDirectory.getDirectory(INDEX_DIR);
        IndexReader indexReader = IndexReader.open(directory);
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);
        Analyzer analyzer = new StandardAnalyzer();
        QueryParser queryParser = new QueryParser(FIELD_CONTENTS, analyzer);
        Query query = queryParser.parse(searchString);
        // First pass: find out how many documents match in total
        TopDocCollector collector = new TopDocCollector(5);
        indexSearcher.search(query, collector);
        int numTotalHits = collector.getTotalHits();
        // Second pass: collect every matching document
        collector = new TopDocCollector(numTotalHits);
        indexSearcher.search(query, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;
        for (ScoreDoc sd : hits) {
            int docId = sd.doc;
            Document document = indexSearcher.doc(docId);
            // Caution: sd.doc is Lucene's internal document ID, not a count of matches
            System.out.println("Number of matches(Hits) in the document " + document.get(FIELD_PATH) + " of the given string " + searchString + " is " + sd.doc);
        }
        indexSearcher.close();
        indexReader.close();
    }
}
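To make clearer what "indexing" and "searching" mean here, the following is a minimal sketch of an inverted index in plain Java. It is not how Lucene is implemented, and ToyInvertedIndex and its methods are hypothetical names used only for illustration. It assumes, as a simplification of what StandardAnalyzer and the query parser do by default, that text is split into lowercase tokens and that a multi-word query matches any document containing at least one of its words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Lucene.
public class ToyInvertedIndex {
    // term -> names of the documents that contain it
    private final Map<String, Set<String>> postings = new TreeMap<>();

    // Split on anything that is not a letter or digit and lowercase each token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // "Indexing": record, for every term, which document it came from.
    public void add(String docName, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docName);
        }
    }

    // "Searching": a document matches if it contains at least one query term.
    public Set<String> search(String query) {
        Set<String> result = new TreeSet<>();
        for (String term : tokenize(query)) {
            result.addAll(postings.getOrDefault(term, new TreeSet<>()));
        }
        return result;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("vehicleOwnedByABC.txt", "Maruti\nmahindra\nvento\nhonda city\nhonda accord\nhyundai");
        index.add("vehicleOwnedByDEF.txt", "swaraj\nrenault\npolo\nnissan\nmaruti\nhyundai");
        // prints [vehicleOwnedByABC.txt, vehicleOwnedByDEF.txt]
        System.out.println(index.search("Hyundai"));
        // prints [vehicleOwnedByABC.txt]
        System.out.println(index.search("Honda city"));
    }
}
```

Because both the indexed text and the query are lowercased, a search for "Hyundai" finds files that contain "hyundai", which is the behaviour the Lucene example relies on as well.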
Step 3: Create two folders in the project: sourceFileToIndex to hold the files to index, and indexDir to hold the generated index.
In the source folder I have copied two documents (text files). The first is vehicleOwnedByABC.txt, with the following content:
Maruti
mahindra
vento
honda city
honda accord
hyundai
The other file is vehicleOwnedByDEF.txt, with the following content:
swaraj
renault
polo
nissan
maruti
hyundai
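If you would rather not create the folders and files by hand, a small helper along these lines could set them up. This is only a sketch: SampleDataSetup and createSampleData are hypothetical names that are not part of the original example, and it assumes the folder names match the SOURCE_FILE and INDEX_DIR constants used in LuceneIndexnSearch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the original example.
public class SampleDataSetup {

    // Creates sourceFileToIndex/ and indexDir/ under baseDir and writes the two sample files.
    public static void createSampleData(Path baseDir) throws IOException {
        Path source = Files.createDirectories(baseDir.resolve("sourceFileToIndex"));
        Files.createDirectories(baseDir.resolve("indexDir"));

        write(source.resolve("vehicleOwnedByABC.txt"),
                Arrays.asList("Maruti", "mahindra", "vento", "honda city", "honda accord", "hyundai"));
        write(source.resolve("vehicleOwnedByDEF.txt"),
                Arrays.asList("swaraj", "renault", "polo", "nissan", "maruti", "hyundai"));
    }

    // Writes one entry per line, replacing the file if it already exists.
    private static void write(Path file, List<String> lines) throws IOException {
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Assumes the program is started from the project root.
        createSampleData(Paths.get("."));
    }
}
```

Run it once from the project root before running the indexing code, so that the relative paths resolve to the same folders.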
Step 4: Create your main class to test the indexing and searching as follows:
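package com.test;

public class Main {
    public static void main(String[] args) {
        LuceneIndexnSearch lins = new LuceneIndexnSearch();
        try {
            // Build the index first, then run a few sample searches against it
            lins.createIndex();
            // searchIndex is static, so it is called on the class
            LuceneIndexnSearch.searchIndex("Hyundai");
            LuceneIndexnSearch.searchIndex("Maruti");
            LuceneIndexnSearch.searchIndex("Mahindra");
            LuceneIndexnSearch.searchIndex("Honda city");
            LuceneIndexnSearch.searchIndex("Honda accord");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}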
Step 5: Here is the output on the console:
Searching for 'Hyundai'
Number of matches(Hits) in the document F:\SpringExamples\LuceneExample\sourceFileToIndex\vehicleOwnedByDEF.txt of the given string Hyundai is 1
Number of matches(Hits) in the document F:\SpringExamples\LuceneExample\sourceFileToIndex\vehicleOwnedByABC.txt of the given string Hyundai is 0
Searching for 'Maruti'
Number of matches(Hits) in the document F:\SpringExamples\LuceneExample\sourceFileToIndex\vehicleOwnedByDEF.txt of the given string Maruti is 1
Number of matches(Hits) in the document F:\SpringExamples\LuceneExample\sourceFileToIndex\vehicleOwnedByABC.txt of the given string Maruti is 0
Searching for 'Mahindra'
Number of matches(Hits) in the document F:\SpringExamples\LuceneExample\sourceFileToIndex\vehicleOwnedByABC.txt of the given string Mahindra is 0
Searching for 'Honda city'
Number of matches(Hits) in the document F:\SpringExamples\LuceneExample\sourceFileToIndex\vehicleOwnedByABC.txt of the given string Honda city is 0
Searching for 'Honda accord'
Number of matches(Hits) in the document F:\SpringExamples\LuceneExample\sourceFileToIndex\vehicleOwnedByABC.txt of the given string Honda accord is 0
Note that the number at the end of each line is sd.doc, Lucene's internal ID of the matching document (0 or 1 here), not a count of matches inside the file.
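Hits come back ordered by relevance score, which is one plausible reason why vehicleOwnedByDEF.txt is listed before vehicleOwnedByABC.txt for "Hyundai" and "Maruti". The sketch below is a rough illustration, assuming a classic TF-IDF style of scoring in which, for a single-term query, the part of the score that differs between documents is sqrt(term frequency) / sqrt(number of tokens in the document). LengthNormSketch and its methods are hypothetical names, and Lucene's real scoring includes further factors and stores the length factor with reduced precision, so the values are indicative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
public class LengthNormSketch {

    // Lowercase the text and split it on whitespace.
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // Document-dependent part of the score for one term:
    // sqrt(term frequency) * 1 / sqrt(number of tokens in the document).
    static double partialScore(List<String> tokens, String term) {
        long freq = tokens.stream().filter(t -> t.equals(term)).count();
        if (freq == 0) {
            return 0.0;
        }
        return Math.sqrt(freq) / Math.sqrt(tokens.size());
    }

    public static void main(String[] args) {
        List<String> abc = tokens("Maruti mahindra vento honda city honda accord hyundai"); // 8 tokens
        List<String> def = tokens("swaraj renault polo nissan maruti hyundai");             // 6 tokens
        // prints ABC: 0.354
        System.out.printf(Locale.ROOT, "ABC: %.3f%n", partialScore(abc, "hyundai"));
        // prints DEF: 0.408
        System.out.printf(Locale.ROOT, "DEF: %.3f%n", partialScore(def, "hyundai"));
    }
}
```

Under these assumptions the shorter file scores slightly higher for the same single occurrence of the term, which matches the order seen in the console output.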