A Practical Guide to Connecting to Elasticsearch from Java

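Elasticsearch is a distributed, RESTful full-text search and analytics engine built on Lucene. It serves a wide range of use cases, such as site search, log analytics, and security intelligence.

Thanks to its strong performance and easy-to-use API, more and more companies are choosing Elasticsearch as their search engine.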

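I. Connecting to Elasticsearch

From Java, you can connect to Elasticsearch with either the TransportClient or the Java High-Level REST Client.

1. TransportClient

TransportClient is a Java API provided by Elasticsearch that connects directly to an Elasticsearch cluster over the native transport protocol. It was deprecated in Elasticsearch 7.0 and removed in 8.0, so it is mainly relevant to older clusters. The following example connects to Elasticsearch with TransportClient: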

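// Elasticsearch 5.x API; in 6.x and 7.x, use TransportAddress in place of InetSocketTransportAddress.
// InetAddress.getByName() throws the checked UnknownHostException.
TransportClient client = new PreBuiltTransportClient(Settings.EMPTY)
        .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));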

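Creating a TransportClient instance requires the address and port to connect to. Here we use the default Settings and port 9300 on the local host. Settings.EMPTY works only when the cluster keeps its default name, elasticsearch; otherwise cluster.name must be set in the Settings.

2. Java High-Level REST Client

The Java High-Level REST Client was for a long time the Java client recommended by Elasticsearch (it has been deprecated since 7.15 in favor of the newer Java API Client). It communicates with Elasticsearch over HTTP. The following example connects to Elasticsearch with the Java High-Level REST Client: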

RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));

The RestHighLevelClient instance is built from an HttpHost that gives the host name, port, and scheme. Here we connect to port 9200 on the local host over http.
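Because the REST client speaks plain HTTP, every call ends up targeting a URL derived from the host, port, and scheme given to HttpHost. The following is a minimal sketch of that derivation using only the JDK. It assumes the standard search endpoint layout /{index}/_search, and the names EsEndpoint, baseUri, and searchUri are hypothetical; they are not part of any Elasticsearch library.

```java
import java.net.URI;

// Hypothetical helper: mirrors the (host, port, scheme) triple given to HttpHost.
class EsEndpoint {
    private final String host;
    private final int port;
    private final String scheme;

    EsEndpoint(String host, int port, String scheme) {
        this.host = host;
        this.port = port;
        this.scheme = scheme;
    }

    // Base address of the node, for example http://localhost:9200
    URI baseUri() {
        return URI.create(scheme + "://" + host + ":" + port);
    }

    // Address of the search endpoint for one index: /{index}/_search
    URI searchUri(String index) {
        return URI.create(baseUri() + "/" + index + "/_search");
    }

    public static void main(String[] args) {
        EsEndpoint local = new EsEndpoint("localhost", 9200, "http");
        System.out.println(local.baseUri());             // http://localhost:9200
        System.out.println(local.searchUri("my_index")); // http://localhost:9200/my_index/_search
    }
}
```

The client builds and sends such requests internally, so application code normally never assembles these URLs by hand.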

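II. Index Operations

1. Creating an index

Creating an index requires an index name and, in most cases, a mapping for its fields. The following example creates an index named "my_index":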

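CreateIndexRequest request = new CreateIndexRequest("my_index");
// The "string" type is no longer supported in current versions: use "keyword" for exact values, "text" for full text.
request.mapping("doc", "field1", "type=keyword", "field2", "type=integer");
// The overloads without RequestOptions were deprecated in client 6.4 and removed in 7.0.
CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);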

We create a CreateIndexRequest with the index name "my_index", declare the field mappings with mapping(), and then create the index by calling create() on the object returned by indices().
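The key/value pairs passed to mapping() are shorthand for a JSON mapping document. To illustrate the structure they describe, the sketch below assembles such a document with plain string handling. It assumes the typeless layout (mappings, then properties) used by recent Elasticsearch versions and field names that need no JSON escaping. MappingJson and build are hypothetical names; real code would more likely rely on the client's XContentBuilder or a JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: renders field-name -> field-type pairs as a mapping body.
class MappingJson {
    static String build(Map<String, String> fieldTypes) {
        StringJoiner properties = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> field : fieldTypes.entrySet()) {
            // Each field becomes "name":{"type":"..."}; names are assumed to need no escaping.
            properties.add("\"" + field.getKey() + "\":{\"type\":\"" + field.getValue() + "\"}");
        }
        return "{\"mappings\":{\"properties\":" + properties + "}}";
    }

    public static void main(String[] args) {
        Map<String, String> fieldTypes = new LinkedHashMap<>(); // keeps insertion order
        fieldTypes.put("field1", "keyword");
        fieldTypes.put("field2", "integer");
        System.out.println(build(fieldTypes));
    }
}
```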

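2. Adding a document

Adding a document requires the index name, the type, and the document ID. Documents are normally JSON, which can be produced from a Map or from an entity class. The following example adds a document: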

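IndexRequest request = new IndexRequest("my_index", "doc", "1");
// Field names and values alternate after the content type: field1="value1", field2=2.
request.source(XContentType.JSON, "field1", "value1", "field2", 2);
IndexResponse response = client.index(request, RequestOptions.DEFAULT);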

The IndexRequest constructor takes the index name, type, and document ID; source() supplies the document content, and index() sends the request.
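When the document comes from an entity class, one common pattern is to convert the object to a Map first and pass that map to IndexRequest.source(Map). The following is a minimal sketch of such a conversion. The Article class and its toSource() method are hypothetical and exist only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical entity whose properties match the fields of "my_index".
class Article {
    private final String field1;
    private final int field2;

    Article(String field1, int field2) {
        this.field1 = field1;
        this.field2 = field2;
    }

    // Converts the entity into the field-name -> value map that represents the document.
    Map<String, Object> toSource() {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("field1", field1);
        source.put("field2", field2);
        return source;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new Article("value1", 2).toSource();
        System.out.println(source); // prints {field1=value1, field2=2}
    }
}
```

The resulting map could then be handed to the request with request.source(source). For larger entities, a JSON library is usually a better fit than hand-written conversion.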

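3. Querying documents

Querying documents requires a query and the name of the index to search; the result is returned as a SearchResponse object. The following example runs a query: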

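SearchRequest searchRequest = new SearchRequest("my_index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// A term query looks up the exact term "value1" in the index without analyzing it.
searchSourceBuilder.query(QueryBuilders.termQuery("field1", "value1"));
searchRequest.source(searchSourceBuilder);
// 7.x clients require the RequestOptions argument.
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);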

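The SearchRequest names the index, the SearchSourceBuilder holds the query and is attached to the request with source(), and search() executes the request.

III. Aggregations

1. Counting documents

Documents can be counted with the count() method. The following example counts all documents in an index: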

CountRequest countRequest = new CountRequest("my_index");
// With no query set, the request counts every document in the index.
CountResponse countResponse = client.count(countRequest, RequestOptions.DEFAULT);
long count = countResponse.getCount();

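The CountRequest names the index, and count() returns the number of matching documents.

2. Aggregating by field

An aggregation groups documents according to a criterion. The following example groups all documents by the value of the "field1" field: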

SearchRequest searchRequest = new SearchRequest("my_index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// A terms aggregation needs a keyword or numeric field; text fields are rejected by default.
searchSourceBuilder.aggregation(AggregationBuilders.terms("by_field1").field("field1"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
Terms terms = searchResponse.getAggregations().get("by_field1");
for (Terms.Bucket entry : terms.getBuckets()) {
    // getKeyAsString() works for any key type; casting getKey() to String fails for numeric keys.
    String key = entry.getKeyAsString();
    long docCount = entry.getDocCount();
}

AggregationBuilders.terms() defines the aggregation, and its result is carried in the SearchResponse. getAggregations() returns all aggregation results, get() selects one of them by name, and the for loop walks through its buckets.
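Conceptually, a terms aggregation groups documents by the value of a field and reports one bucket per distinct value together with its document count. The sketch below reproduces that grouping in memory with Java streams, purely to illustrate the shape of the result. It does not call Elasticsearch, and TermsAggregationSketch and countByTerm are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class TermsAggregationSketch {
    // Groups field values into buckets: key = distinct value, value = number of documents.
    // Elasticsearch orders buckets by document count by default; a TreeMap (sorted by key)
    // is used here only to make the output order predictable.
    static Map<String, Long> countByTerm(List<String> fieldValues) {
        return fieldValues.stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // One entry per document: the value of "field1" in that document.
        List<String> field1Values = Arrays.asList("value1", "value2", "value1", "value3", "value1");
        for (Map.Entry<String, Long> bucket : countByTerm(field1Values).entrySet()) {
            System.out.println(bucket.getKey() + " -> " + bucket.getValue());
        }
    }
}
```

Each map entry plays the role of one Terms.Bucket: the key corresponds to the bucket key and the value to its document count.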

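IV. Data Analysis

1. Term Vectors

Term vectors report, for each term in a document field, information such as its frequency and its positions. The following example retrieves term vectors: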

TermVectorsRequest request = new TermVectorsRequest("my_index", "doc", "1");
// Restrict the response to the term vectors of "field1".
request.setFields("field1");
TermVectorsResponse response = client.termvectors(request, RequestOptions.DEFAULT);

The TermVectorsRequest identifies the document by index name, type, and document ID; setFields() selects the fields to analyze, and termvectors() executes the request.
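To illustrate the kind of information a term vector carries, the sketch below computes term frequencies and token positions for a piece of text in memory. It assumes a simplistic analyzer that lowercases the text and splits it on whitespace, which is not how Elasticsearch analyzers behave in general. TermVectorSketch and positions are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class TermVectorSketch {
    // Maps each term to the list of positions at which it occurs;
    // the size of each list is that term's frequency.
    static Map<String, List<Integer>> positions(String text) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        String[] tokens = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int position = 0; position < tokens.length; position++) {
            result.computeIfAbsent(tokens[position], term -> new ArrayList<>()).add(position);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> vector = positions("the quick fox jumps over the lazy fox");
        vector.forEach((term, where) ->
                System.out.println(term + " freq=" + where.size() + " positions=" + where));
    }
}
```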

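2. Multi-field query

MultiMatchQuery runs a single query text against several fields. The following example finds documents in which "field1" or "field2" matches the text "value1":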

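SearchRequest searchRequest = new SearchRequest("my_index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// The first argument is the query text; the remaining arguments are the fields to search.
// lenient(true) avoids the format error that non-numeric text raises on an integer field such as field2.
searchSourceBuilder.query(QueryBuilders.multiMatchQuery("value1", "field1", "field2").lenient(true));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);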

QueryBuilders.multiMatchQuery() takes the query text followed by the fields to search; the query is attached to the request with source(), and search() executes it.
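The matching rule can be pictured as "the query text must match at least one of the listed fields". The sketch below models that rule in memory for a single term, treating every field as lowercased, whitespace-tokenized text. It ignores scoring and real analysis entirely, and MultiMatchSketch and matchesAnyField are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class MultiMatchSketch {
    // Returns true when the term occurs as a token in at least one of the given fields.
    static boolean matchesAnyField(Map<String, String> document, String term, List<String> fields) {
        String wanted = term.toLowerCase(Locale.ROOT);
        for (String field : fields) {
            String value = document.get(field);
            if (value == null) {
                continue; // the document has no such field
            }
            List<String> tokens = Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\s+"));
            if (tokens.contains(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> document = new HashMap<>();
        document.put("field1", "some other text");
        document.put("field2", "contains value1 here");
        // Matches because field2 contains the term, even though field1 does not.
        System.out.println(matchesAnyField(document, "value1", Arrays.asList("field1", "field2"))); // true
        // No match when only field1 is searched.
        System.out.println(matchesAnyField(document, "value1", Arrays.asList("field1")));           // false
    }
}
```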

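3. Term query

A term query finds documents that contain a specific term exactly as it is stored in the index. The following example finds documents whose "field1" field holds the term "value1":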

SearchRequest searchRequest = new SearchRequest("my_index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// The search term is not analyzed, so it must equal the indexed term exactly.
searchSourceBuilder.query(QueryBuilders.termQuery("field1", "value1"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

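QueryBuilders.termQuery() takes the field to query and the term to look for; the query is attached to the request with source(), and search() executes it.

These are some of the most common operations when working with Elasticsearch from Java. The examples from this article are collected in the complete listing below:

Complete code: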

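// Assumes a late 6.x or 7.x High-Level REST Client (before 7.16); package names
// and deprecations vary by version. The listing uses the REST client throughout,
// because indices(), index(), search(), count() and termvectors() belong to it.
import java.io.IOException;

import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
import org.elasticsearch.action.admin.indices.create.CreateIndexResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.core.CountRequest;
import org.elasticsearch.client.core.CountResponse;
import org.elasticsearch.client.core.TermVectorsRequest;
import org.elasticsearch.client.core.TermVectorsResponse;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class ElasticsearchExample {
    public static void main(String[] args) throws IOException {
        // try-with-resources closes the client and its underlying connections
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

            // Create the index
            CreateIndexRequest createIndexRequest = new CreateIndexRequest("my_index");
            createIndexRequest.mapping("doc", "field1", "type=keyword", "field2", "type=integer");
            CreateIndexResponse createIndexResponse =
                    client.indices().create(createIndexRequest, RequestOptions.DEFAULT);

            // Add a document
            IndexRequest indexRequest = new IndexRequest("my_index", "doc", "1");
            indexRequest.source(XContentType.JSON, "field1", "value1", "field2", 2);
            IndexResponse indexResponse = client.index(indexRequest, RequestOptions.DEFAULT);

            // Query documents (a new document becomes searchable only after the index refreshes)
            SearchRequest searchRequest = new SearchRequest("my_index");
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            searchSourceBuilder.query(QueryBuilders.termQuery("field1", "value1"));
            searchRequest.source(searchSourceBuilder);
            SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

            // Count documents
            CountRequest countRequest = new CountRequest("my_index");
            CountResponse countResponse = client.count(countRequest, RequestOptions.DEFAULT);
            long count = countResponse.getCount();

            // Aggregation
            searchSourceBuilder.aggregation(AggregationBuilders.terms("by_field1").field("field1"));
            searchRequest.source(searchSourceBuilder);
            searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
            Terms terms = searchResponse.getAggregations().get("by_field1");
            for (Terms.Bucket entry : terms.getBuckets()) {
                String key = entry.getKeyAsString();
                long docCount = entry.getDocCount();
            }

            // Term Vectors
            TermVectorsRequest termVectorsRequest = new TermVectorsRequest("my_index", "doc", "1");
            termVectorsRequest.setFields("field1");
            TermVectorsResponse termVectorsResponse =
                    client.termvectors(termVectorsRequest, RequestOptions.DEFAULT);

            // Multi-field query (lenient because field2 is an integer field)
            searchSourceBuilder.query(
                    QueryBuilders.multiMatchQuery("value1", "field1", "field2").lenient(true));
            searchRequest.source(searchSourceBuilder);
            searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

            // Term query
            searchSourceBuilder.query(QueryBuilders.termQuery("field1", "value1"));
            searchRequest.source(searchSourceBuilder);
            searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        }
    }
}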