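Elasticsearch is a distributed, RESTful full-text search and analytics engine built on Lucene. It serves a wide range of use cases, such as site search, log analysis, and security intelligence.
Because it performs well and exposes an easy-to-use API, more and more companies are choosing Elasticsearch as their search engine.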
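I. Connecting to Elasticsearch
A Java application can connect to Elasticsearch in two ways: with the TransportClient or with the Java High-Level REST Client.
1. TransportClient
TransportClient is a Java API provided by Elasticsearch that connects directly to the cluster over the binary transport protocol. Note that it was deprecated in Elasticsearch 7.0 and removed in 8.0, so it is only an option for older clusters. Below is an example of connecting with TransportClient:

// 6.x API; in 5.x the address class was named InetSocketTransportAddress.
TransportClient client = new PreBuiltTransportClient(Settings.EMPTY)
    .addTransportAddress(new TransportAddress(InetAddress.getByName("localhost"), 9300));

Creating a TransportClient instance requires the address and port to connect to. Here we use the default settings (Settings.EMPTY, which assumes the default cluster name "elasticsearch") and the transport port 9300 on localhost. InetAddress.getByName() throws the checked UnknownHostException, so the enclosing method must handle or declare it.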
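2. Java High-Level REST Client
The Java High-Level REST Client talks to Elasticsearch over HTTP and replaced TransportClient as the recommended Java client (it was itself deprecated in 7.15 in favor of the newer Java API Client). Below is an example of connecting with the Java High-Level REST Client:

RestHighLevelClient client = new RestHighLevelClient(
    RestClient.builder(new HttpHost("localhost", 9200, "http")));

The RestHighLevelClient instance is built from an HttpHost, which holds the host name, port, and scheme. Here we connect to port 9200 on localhost over plain HTTP. Call client.close() when the client is no longer needed so that its underlying connections are released.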
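II. Index operations
The examples from here on use the Java High-Level REST Client with the typed 6.x API (mapping types such as "doc" were deprecated in 7.0 and removed in 8.0). From client version 6.4 onward each call also takes a RequestOptions argument; RequestOptions.DEFAULT is sufficient for these examples.
1. Creating an index
Creating an index requires an index name and, optionally, a mapping. Below is an example that creates an index named "my_index":

CreateIndexRequest request = new CreateIndexRequest("my_index");
// The "string" field type was removed in 5.x; use "keyword" for exact values or "text" for analyzed full text.
request.mapping("doc", "field1", "type=keyword", "field2", "type=integer");
CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);

We create a CreateIndexRequest for the index name "my_index", define the field types with mapping(), and then create the index through client.indices().create(). field1 is mapped as keyword because the later examples query and aggregate on its exact value.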
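Since the REST client ultimately sends HTTP requests, it can help to see roughly what index creation looks like on the wire. The following is a minimal sketch using only the JDK's java.net.http API; it builds the request without sending it. The class name CreateIndexRequestSketch and its helper methods are hypothetical, and the mapping body assumes the typed 6.x format with a "doc" type (7.x and later expect the properties directly under "mappings").

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical illustration: the HTTP request behind an index-creation call.
final class CreateIndexRequestSketch {

    // Assumed 6.x-style mapping body with a "doc" type.
    static String mappingJson() {
        return "{\"mappings\":{\"doc\":{\"properties\":{"
                + "\"field1\":{\"type\":\"keyword\"},"
                + "\"field2\":{\"type\":\"integer\"}}}}}";
    }

    // Builds (but does not send) a PUT /{index} request carrying the mapping.
    static HttpRequest build(String host, int port, String index, String body) {
        return HttpRequest.newBuilder(URI.create("http://" + host + ":" + port + "/" + index))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("localhost", 9200, "my_index", mappingJson());
        System.out.println(request.method() + " " + request.uri());
        System.out.println(mappingJson());
    }
}
```

Sending the request would require a java.net.http.HttpClient and a running cluster; the sketch stops at construction so that it can be inspected without one.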
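2. Adding a document
Adding a document requires the index name, the type, and a document ID. Documents are normally JSON and can be produced from a Map or by serializing an entity class. Below is an example of adding a document:

IndexRequest request = new IndexRequest("my_index", "doc", "1");
// source() accepts alternating field names and values.
request.source(XContentType.JSON, "field1", "value1", "field2", 2);
IndexResponse response = client.index(request, RequestOptions.DEFAULT);

The IndexRequest constructor takes the index name, type, and document ID; source() supplies the document content; and client.index() sends the request. Indexing a document under an ID that already exists replaces the stored document.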
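To make the "alternating field names and values" convention concrete, here is a small stand-alone sketch that turns such a list into a Map and then into a JSON string. It does not use the Elasticsearch library; DocumentSourceSketch and its methods are hypothetical names, and the JSON rendering is deliberately minimal (flat documents, basic escaping only).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration of building a flat JSON document from key/value pairs.
final class DocumentSourceSketch {

    // Pairs up alternating names and values, preserving insertion order.
    static Map<String, Object> toMap(Object... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected an even number of arguments");
        }
        Map<String, Object> doc = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            doc.put(String.valueOf(keyValues[i]), keyValues[i + 1]);
        }
        return doc;
    }

    // Renders a flat map as JSON; numbers, booleans and null are left unquoted.
    static String toJson(Map<String, Object> doc) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            Object v = e.getValue();
            String rendered = (v == null || v instanceof Number || v instanceof Boolean)
                    ? String.valueOf(v)
                    : quote(String.valueOf(v));
            json.add(quote(e.getKey()) + ":" + rendered);
        }
        return json.toString();
    }

    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(toJson(toMap("field1", "value1", "field2", 2)));
    }
}
```

In real code a JSON library or the client's own builders would handle nested objects, arrays, and full escaping; the sketch only shows the shape of the data.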
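3. Searching documents
A search needs a query and the name of the index to search; the result is returned as a SearchResponse. Below is an example of searching documents:

SearchRequest searchRequest = new SearchRequest("my_index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.termQuery("field1", "value1"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

The SearchRequest names the index, the SearchSourceBuilder holds the query, and source() attaches the builder to the request. client.search() executes it; the matching documents are available through searchResponse.getHits().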
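III. Aggregations
1. Counting documents
The number of documents can be obtained with the count() method. Below is an example that counts all documents in an index:

// CountRequest here is the high-level client's class in org.elasticsearch.client.core.
CountRequest countRequest = new CountRequest("my_index");
CountResponse countResponse = client.count(countRequest, RequestOptions.DEFAULT);
long count = countResponse.getCount();

The CountRequest names the index, and client.count() returns the document count. Without a query the request counts every document in the index.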
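2. Aggregating by field
An aggregation groups documents by some criterion. Below is an example that groups all documents by the value of "field1":

SearchRequest searchRequest = new SearchRequest("my_index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.aggregation(AggregationBuilders.terms("by_field1").field("field1"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
Terms terms = searchResponse.getAggregations().get("by_field1");
for (Terms.Bucket entry : terms.getBuckets()) {
    // getKey() returns Object; getKeyAsString() avoids an unchecked cast.
    String key = entry.getKeyAsString();
    long docCount = entry.getDocCount();
}

AggregationBuilders.terms() defines a terms aggregation named "by_field1" on the field "field1". The result comes back inside the SearchResponse: getAggregations() returns all aggregation results, get() selects one by name, and the for loop reads the key and document count of each bucket. A terms aggregation needs a keyword or numeric field; an analyzed text field is rejected unless fielddata is enabled on it.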
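Conceptually, a terms aggregation produces one bucket per distinct field value together with the number of documents holding that value. The sketch below reproduces that idea in memory with plain Java collections, ordering buckets by descending document count. TermsAggregationSketch is a hypothetical name, documents are modeled as simple maps, and breaking ties by key is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory analogue of a terms aggregation.
final class TermsAggregationSketch {

    // Returns value -> document count, ordered by count descending, then key ascending.
    static Map<String, Long> termsBuckets(List<Map<String, Object>> docs, String field) {
        Map<String, Long> counts = new TreeMap<>();
        for (Map<String, Object> doc : docs) {
            Object value = doc.get(field);
            if (value != null) {               // documents without the field fall into no bucket
                counts.merge(String.valueOf(value), 1L, Long::sum);
            }
        }
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : entries) {
            buckets.put(e.getKey(), e.getValue());
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = List.of(
                Map.<String, Object>of("field1", "a"),
                Map.<String, Object>of("field1", "b"),
                Map.<String, Object>of("field1", "a"));
        termsBuckets(docs, "field1")
                .forEach((key, docCount) -> System.out.println(key + " -> " + docCount));
    }
}
```

Each map entry corresponds to what the loop over terms.getBuckets() reads: the bucket key and its document count.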
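IV. Data analysis
1. Term Vectors
Term vectors describe the terms of a document field: how often each term occurs, and optionally its positions and offsets. Below is an example of retrieving term vectors:

// TermVectorsRequest here is the high-level client's class in org.elasticsearch.client.core.
TermVectorsRequest request = new TermVectorsRequest("my_index", "doc", "1");
request.setFields("field1");
TermVectorsResponse response = client.termvectors(request, RequestOptions.DEFAULT);

The TermVectorsRequest identifies the document by index name, type, and document ID; setFields() selects the fields to analyze (field names, not field types); and client.termvectors() runs the analysis. Term vectors are most informative on analyzed text fields; on a keyword field the whole value is treated as a single term.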
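To show what kind of information term vectors carry, the following sketch computes term frequencies and token positions for a piece of text in plain Java. It is only an approximation of the idea: TermVectorSketch is a hypothetical name, and the tokenizer (lowercase, split on anything that is not a letter or digit) is a simplifying assumption rather than a real Elasticsearch analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of per-term frequency and position data.
final class TermVectorSketch {

    // Maps each term to the list of token positions where it occurs.
    // The term frequency is the size of that list.
    static Map<String, List<Integer>> termPositions(String text) {
        Map<String, List<Integer>> positions = new TreeMap<>();
        int position = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            positions.computeIfAbsent(token, t -> new ArrayList<>()).add(position);
            position++;
        }
        return positions;
    }

    public static void main(String[] args) {
        termPositions("the quick fox jumps over the lazy fox").forEach((term, pos) ->
                System.out.println(term + " freq=" + pos.size() + " positions=" + pos));
    }
}
```

The real API can additionally return character offsets, payloads, and index-wide statistics, which this sketch does not attempt to model.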
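2. Multi-field query
A MultiMatchQuery runs the same query text against several fields. Below is an example that finds documents in which "value1" matches "field1" or "field2":

SearchRequest searchRequest = new SearchRequest("my_index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// lenient(true) ignores format errors such as a text query against the integer field2.
searchSourceBuilder.query(QueryBuilders.multiMatchQuery("value1", "field1", "field2").lenient(true));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

QueryBuilders.multiMatchQuery() takes the query text followed by the fields to search; source() attaches the query to the request and client.search() executes it. With the default settings a document matches when any one of the listed fields matches, not only when all of them do.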
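The "any field may match" behavior can be sketched in memory as follows. This is a simplified model, not the library's implementation: MultiMatchSketch is a hypothetical name, documents are flat string maps, analysis is reduced to lowercasing and splitting on non-alphanumeric characters, and scoring is ignored entirely.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical model of multi-field matching with OR semantics.
final class MultiMatchSketch {

    // Simplified analysis: lowercase, then split on non-alphanumeric characters.
    static Set<String> analyze(String text) {
        Set<String> terms = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // True when at least one query term occurs in at least one of the given fields.
    static boolean multiMatch(Map<String, String> doc, String query, String... fields) {
        Set<String> queryTerms = analyze(query);
        for (String field : fields) {
            String value = doc.get(field);
            if (value == null) {
                continue; // a missing field simply cannot match
            }
            Set<String> fieldTerms = analyze(value);
            for (String term : queryTerms) {
                if (fieldTerms.contains(term)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> doc = Map.of("field1", "something else", "field2", "contains value1");
        System.out.println(multiMatch(doc, "value1", "field1", "field2"));
    }
}
```

Requiring a match in every field would instead call for a bool query with one must clause per field.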
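3. Term query
A term query finds documents that contain an exact term in a field; the query value is not analyzed. Below is an example that finds documents whose "field1" holds exactly the term "value1":

SearchRequest searchRequest = new SearchRequest("my_index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.termQuery("field1", "value1"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

QueryBuilders.termQuery() takes the field name and the term to look for; source() attaches the query to the request and client.search() executes it. Term queries suit keyword, numeric, and date fields. On an analyzed text field the indexed terms have usually been lowercased and split, so an exact term may not match; a match query is the better choice there.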
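The difference between an exact term lookup and an analyzed match can be illustrated without a cluster. In the sketch below, a text field is assumed to be indexed as lowercased tokens; the term lookup compares the query value as given, while the match lookup analyzes it first. TermVsMatchSketch and its methods are hypothetical names, and the analyzer is a simplification of what a real analyzer does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical contrast between exact term lookup and analyzed matching.
final class TermVsMatchSketch {

    // Simplified analysis: lowercase, then split on non-alphanumeric characters.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Term lookup: the query value is compared to the indexed terms unchanged.
    static boolean termQuery(List<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    // Match lookup: the query text is analyzed first, then any token may match.
    static boolean matchQuery(List<String> indexedTerms, String queryText) {
        for (String token : analyze(queryText)) {
            if (indexedTerms.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("Value1 Example"); // what an analyzed text field would store
        System.out.println("term  'Value1': " + termQuery(indexed, "Value1"));
        System.out.println("term  'value1': " + termQuery(indexed, "value1"));
        System.out.println("match 'Value1': " + matchQuery(indexed, "Value1"));
    }
}
```

On a keyword field the stored term is the original value itself, which is why an exact term lookup behaves predictably there.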
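These are some of the common operations for working with Elasticsearch from Java. The examples from this article are collected in the complete listing below. It uses the Java High-Level REST Client throughout, because the calls shown (client.indices(), client.index(), client.search(), client.count(), client.termvectors()) belong to that client rather than to TransportClient, and it gives each request and response variable a distinct name so that the listing compiles as one block.
Complete code:

// Imports assume the 6.x high-level client (6.6 or later for the count API).
import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
import org.elasticsearch.action.admin.indices.create.CreateIndexResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.core.CountRequest;
import org.elasticsearch.client.core.CountResponse;
import org.elasticsearch.client.core.TermVectorsRequest;
import org.elasticsearch.client.core.TermVectorsResponse;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.builder.SearchSourceBuilder;

// The statements below belong inside a method that declares "throws IOException".
RestHighLevelClient client = new RestHighLevelClient(
    RestClient.builder(new HttpHost("localhost", 9200, "http")));

// Create the index
CreateIndexRequest createRequest = new CreateIndexRequest("my_index");
createRequest.mapping("doc", "field1", "type=keyword", "field2", "type=integer");
CreateIndexResponse createResponse = client.indices().create(createRequest, RequestOptions.DEFAULT);

// Add a document
IndexRequest indexRequest = new IndexRequest("my_index", "doc", "1");
indexRequest.source(XContentType.JSON, "field1", "value1", "field2", 2);
IndexResponse indexResponse = client.index(indexRequest, RequestOptions.DEFAULT);

// Search documents
SearchRequest searchRequest = new SearchRequest("my_index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.termQuery("field1", "value1"));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

// Count documents
CountRequest countRequest = new CountRequest("my_index");
CountResponse countResponse = client.count(countRequest, RequestOptions.DEFAULT);
long count = countResponse.getCount();

// Aggregate by field1 (the term query set above still applies)
searchSourceBuilder.aggregation(AggregationBuilders.terms("by_field1").field("field1"));
searchRequest.source(searchSourceBuilder);
searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
Terms terms = searchResponse.getAggregations().get("by_field1");
for (Terms.Bucket entry : terms.getBuckets()) {
    String key = entry.getKeyAsString();
    long docCount = entry.getDocCount();
}

// Term vectors
TermVectorsRequest termVectorsRequest = new TermVectorsRequest("my_index", "doc", "1");
termVectorsRequest.setFields("field1");
TermVectorsResponse termVectorsResponse = client.termvectors(termVectorsRequest, RequestOptions.DEFAULT);

// Multi-field query (lenient because field2 is an integer field)
searchSourceBuilder.query(QueryBuilders.multiMatchQuery("value1", "field1", "field2").lenient(true));
searchRequest.source(searchSourceBuilder);
searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

// Term query
searchSourceBuilder.query(QueryBuilders.termQuery("field1", "value1"));
searchRequest.source(searchSourceBuilder);
searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

// Release the client's connections
client.close();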