1 Star 0 Fork 1

有理想的鸭子 / esqlj-elastic-jdbc

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
克隆/下载
贡献代码
同步代码
取消
提示: 由于 Git 不支持空文件夾,创建文件夹后会生成空的 .keep 文件
Loading...
README
Apache-2.0

esqlj-elastic-jdbc

一个在Apache License 2.0下发布的Elasticsearch开源JDBC驱动

esqlj是在Elastic Rest High Level API(版本7.11)之上构建的针对Elastic的JDBC驱动。有关Elastic许可证,请参阅 licenses/elastic-licenses文件夹。

esqlj扩展了SQL语法,提供了全文查询、地理查询、形状查询、连接查询等高级Elastic查询功能。

Sql解析由jsqlparser库 JSQLParser提供。 相关许可证请参阅 licenses/JSqlParser文件夹。

关于SQL实现,请参见下文“支持矩阵与约定”部分。

项目状态

尚未达到生产可用级别

已实现DQL DDL和DML实现部分(update, insert, delete, drop)

驱动类

org.fpasti.jdbc.esqlj.EsDriver

JDBC连接字符串

JDBC URL必须遵循以下语法:

jdbc:esqlj:http<s>://<elastic_address_1>:<elastic_port_1>,http://<elastic_address_2>:<elastic_port_2>,...;param1=paramValue1;...

可以通过逗号分隔的URL列表声明一组连接池。

可选参数:

参数名 描述 默认值
userName 用户名 -
password 密码 -
includeTextFieldsByDefault 是否默认包含Elastic text 类型字段在 select * false
indexMetaDataCache 缓存检索到的索引结构(建议保持此功能启用,因为获取这些信息可能是一项耗时的操作,特别是对于别名或星标索引查询)。对不可变索引来说,最好启用该选项 true
maxGroupByRetrievedElements 对选定字段进行GROUP BY查询时的最大检索元素数 500
queryScrollFromRows 第一次分页时获取的行数 500
queryScrollFetchSize 下一次分页时获取的行数 500
queryScrollTimeoutMinutes 分页之间超时时间,以分钟为单位 3
queryScrollOnlyByScrollApi 如果为true,则分页将通过Elastic Scroll API执行。如果为false,将根据查询情况应用最适合的滚动策略(请参阅下方分页段落) true
sharedConnection 如果设置为 true ,esqlj内部使用的Elastic客户端将在所有连接间静态共享(如果您在同一JVM内没有连接到不同Elastic集群的需求时使用此项) true

概念

Elasticsearch的索引被当作SQL表来管理。 Elasticsearch的别名则被当作SQL视图来管理。

包含特殊字符 如 *, -, . 的索引进行查询时,需要使用双引号包裹。 例如: 'SELECT * FROM ".test-index*"'
包含特殊字符 - 的字段和别名也必须使用双引号包裹。

Elasticsearch文档标识符"_id"可以在非聚合查询中获取,其类型为 string 并像索引主键一样映射到元数据上。此列也可以在Where条件中用于匹配查询(=、!=)。 搜索评分"_score"在非聚合查询中作为float类型的列返回。

标准SQL过滤语法的功能非常有限。esqlj支持使用Elastic API的全文查询、地理查询、形状查询等自定义语法对文档进行过滤。 目前仅实现了一部分这些高级过滤技术。以下是一个使用 Query string ELastic API (query-dsl-query-string-query)的例子:

SELECT _id, _score FROM indexName WHERE query_string('(new york city) OR (big apple) OR name:/joh?n(ath[oa]n)/', 'field1, field2,city.*', 'minimum_should_match:2')

默认情况下,可检索的最大文档字段/列数被设置为100。
例如,这就解释了为什么默认情况下包含近500个字段的.kibana_*索引在执行select *时会返回错误。 若要根据您的需求增加此配置阈值,请按照需要更改Elastic设置:'index.max_docvalue_fields_search'

针对以'my-index'开头的索引更改最大文档阈值:

PUT /my-index*/_settings
{
  "index" : {
    "max_docvalue_fields_search" : 500
  }
}

更改所有索引的最大文档阈值:

PUT /*/_settings
{
  "index" : {
    "max_docvalue_fields_search" : 500
  }
}

在未来将不再有可能查询系统索引。

Elastic boolean fields are typed BOOLEAN on resultset. Use constants true and false to express conditions on these fields. Example: SELECT * from \"esqlj-test-static-010\" WHERE booleanField=true

DBeaver

A sample usage of esqlj in DBeaver:

DBeaver navigator panel
Tables are Elasticsearch indices. Views are Elasticsearch aliases

DBeaver navigator panel
Index fields

DBeaver navigator panel
Index documents

DBeaver navigator panel Sample SQL query

如何配置DBeaver使用esqlj驱动程序 (without Elastic login)

  • Create a new connection of type Elasticsearch
  • Click "Edit Driver Settings"
  • Change:
    • Class Name: org.fpasti.jdbc.esqlj.EsDriver
    • URL Template: jdbc:esqlj:http://{host}:{port}
    • Remove all jars and add esqlj-<rel>-shaded.jar (available from project build)
    • Click "OK" to confirm
  • Change if required host and port and Test Connection
  • OK

Sample usage from Java

Add driver dependency in pom.xml:

<dependency>
	<groupId>org.fpasti</groupId>
	<artifactId>esqlj</artifactId>
	<version>0.2.0</version>
</dependency>
DriverManager.registerDriver(new EsDriver());
Connection connection = DriverManager.getConnection("jdbc:esqlj:http://localhost:9200");
Statement stmt = null;
ResultSet rs = null;

try {
	stmt = connection.createStatement();
	rs = stmt.executeQuery("SELECT * from \"esqlj-test-static-010\" WHERE booleanField=true");

	// print out column & fields
	ResultSetMetaData rsm = rs.getMetaData();
	for(int i = 1; i <= rs.getMetaData().getColumnCount(); i++) {
		System.out.println(String.format("%d: Column: %s, Column Alias: %s, Type: %s", i, rsm.getColumnName(i), rsm.getColumnLabel(i), rsm.getColumnTypeName(i)));
	}

	// iterate over query res
	while (rs.next()) {
		System.out.println(String.format("_id: %s : doubleField: %f - keywordField: %s - textField: %s - score: %f", rs.getString(10), rs.getDouble(2), rs.getObject(5), rs.getString(8), rs.getFloat(11)));
	}

} catch (SQLException ex) {
	System.out.println("SQLException: " + ex.getMessage());
} finally {
	if(stmt != null) {
		stmt.close();
	}
	if(connection != null) {
		connection.close();
	}
}

PreparedStatement actually unimplemented

Types

Mapping of supported Elastic types to SQL types:

Elastic on index type Metadata declared SQL Type Java effective type
boolean BOOL Boolean
date TIMESTAMP LocalDateTime
date_nanos TIMESTAMP LocalDateTime
doc_id VARCHAR String
double NUMBER Double
flattened STRUCT Object
float NUMBER Float
geo_point STRUCT EsGeoPoint
half_float NUMBER Float
integer NUMBER Integer
ip VARCHAR String
keyword VARCHAR String
long BIGINT Long
object STRUCT Object
scaled_float NUMBER Float
short NUMBER Byte
text VARCHAR String
unsigned_long NUMBER Long
wildcard VARCHAR String

分页

默认情况下,esqlj通过Elastic Scroll API在查询上实现滚动策略。可选地,可以激活按顺序进行的更节省资源的滚动方式,但如果您想启用此功能,请注意在每个查询中至少包含一个tiebreaker字段(将来可能不再支持通过文档ID查询,将标识符存储在文档字段中可能成为一个最佳实践)。目前关于在查询结果中引入自动tiebreaker的RFC正在Elastic产品讨论中。但现在如果您启用了此功能而未添加对tiebreaker字段的排序,数据分页之间可能会跳过一些行。

关于按顺序滚动的话题,驱动程序并未使用Point in Time API(似乎Elastic Rest High level API中缺少对此的支持)。

请注意:Scroll API会在服务器端消耗资源。最佳实践是尽快获取所有所需数据。数据检索完成后,esqlj会自动释放滚动链接。

测试

大部分测试单元需要一个实时的Elastic实例。 这些单元的激活由名为“ESQLJ_TEST_CONFIG”的系统变量控制。 环境变量必须连接有效的esqlj JDBC URL连接字符串以及单元内部查询所需文档的加载策略:

ESQLJ_TEST_CONFIG=jdbc:esqlj:http://<elastic_address>:<elastic_port>|<createAndDestroy or createOnly>
Parameters Actions Scope
createAndDestroy Test units create index 'esqlj-test-volatile-<uuid>' on start and delete it on finish Continuous Delivery/Deployment
createOnly Test units create index 'esqlj-test-static-<release.version>' and not delete it on finish. If it's just present on Elasticsearch it will be preserved. (Will be required a manual delete of it from system). Development stage

Sample configuration: ESQLJ_TEST_CONFIG="jdbc:esqlj:http://10.77.154.32:9080|createOnly"

If ESQLJ_TEST_CONFIG isn't declared, all tests depending from live connection will be skipped.

Support matrix and conventions

Select clause

Actually supported SELECT [...] elements:

Select element Notes
column Elastic document field
column alias or column AS alias Alias for field in query result
* All document fields
_id document identifier (string)
_score document query search score (float)
TO_CHAR(field, mask_date) Format date field. Example: TO_CHAR(timestampField, 'YYYY/MM/DD HH:MI:SS'). Supported mask: YEAR, YYYY, YY, MM, MONTH, MON, DDD, DD, HH24, HH12, HH, MI, SS, DAY, XFF, FFF, FF, F, PM, TZR, TZH.
LATITUDE Extract latitude from EsGeoPoint field (not orderable)
LONGITUDE Extract longitude from EsGeoPoint field (not orderable)
COUNT(*) Number of documents in index: SELECT COUNT(*) FROM index
COUNT(field) Number of documents in index where specified field is present and not null: SELECT COUNT("object.keywordObjectField") FROM index
COUNT(DISTINCT field) Number of distinct values of specified field in index: SELECT COUNT(DISTINCT keywordField) FROM index

Supported GROUP BY query functions:

Function Description Example
AVG Average of values SELECT AVG(integerField) FROM index
COUNT(*) Number of documents Number of documents in group: SELECT COUNT(*) FROM index
COUNT(field) Number of documents with specified field Number of documents in group where specified field is present and not null: SELECT COUNT("object.keywordObjectField") FROM index
COUNT(DISTINCT field) Number of distinct values Number of distinct values of specified field: SELECT COUNT(DISTINCT keywordField) FROM index
MAX Max column value SELECT MAX(integerField) FROM index
MIN Min column value SELECT MIN(integerField) FROM index
SUM Sum of values SELECT SUM(integerField) FROM index

Select Distinct

It's possibile to query distinct values using DISTINCT clause.

Example:SELECT DISTINCT keywordField, booleanField FROM index ORDER BY keywordField, booleanField DESC

Where condition

You can use both column name or column alias in expression.

Expression condition Notes
left expression = value
left expression != value
left expression > numeric_value
left expression >= numeric_value
left expression < numeric_value
left expression <= numeric_value
left expression LIKE expression Implemented by Wildcard Elasticsearch filter. See Elasticsearch documentation about its usage (query-dsl-wildcard-query)
left expression IS NULL
left expression IS NOT NULL
left expression BETWEEN a AND b a and b could be NUMBER, STRING, date expressed by TO_DATE('date', 'mask_date'), EXTRACT function
left expression IN (value1, value2, ...)
query_string(...) Elastic raw query. See below for reference
geo_bounding_box(...) Elastic raw query. See below for reference

Admitted left expression

Expression
column
alias
EXTRACT(period from column)

value=column expression is for example considered invalid from esqlj

Specific Elastic query functions

esqlj allow you to use specific Elastic query API.
Syntax usage is query_type(param1,param2,...), where query_type maps specific Elastic query; and param1,param2,... allows you to pass parameters to that query.
There are a set of mandatory parameters for every implemented custom query. It's possible also to set optional parameters for changing low level configuration query behaviour, like for example analyze_wildcard, fuzzy_max_expansions etc. These configuration settings must to be declared in this way: query_string('search criteria','field1,field2,object.*','analyze_wildcard:true','fuzzy_max_expansions:15'). Esqlj will dynamically cast params value type according to expected parameter Elastic query object.

Currently implemented raw Elastic queries:

Elastic query query_type Parameters Elastic reference
Query string query_string 1: query expression, 2: search on fields (* for all), 3..x: additional query parameters (see Elastic documentation) string_query
Geo bounding box geo_bounding_box 1: geopoint field, 2: top left latitude, 3: top left: longitude, 4: bottom right latitude, 5: bottom right longitude, 6..x extra params geo_bounding_box

Specific Elastic query functions samples

Query type Sample
query_string SELECT id, _score FROM indexName WHERE query_string('(new york city) OR (big apple) OR name:/joh?n(ath[oa]n)/', 'field1, field2,city.*', 'minimum_should_match:2')
geo_bounding_box SELECT _id, _score FROM indexName WHERE geo_bounding_box('geoPointField', 50, 8, 40.1, 10.2)

Functions

Function name Admitted on Notes
SYSDATE Right expression Current date time
SYSDATE() Right expression Current date time
NOW() Right expression Current date time
GETDATE() Right expression Current date time
TRUNC(SYSDATE|SYSDATE()) Right expression Current date
TO_DATE(date, mask_date) Right expression Supported mask: YEAR, YYYY, YY, MM, MONTH, MON, DDD, DD, HH24, HH12, HH, MI, SS, DAY, XFF, FFF, FF, F, PM, TZR, TZH. Example TO_DATE('2020/01/01', 'YYYY/MM/DD')
EXTRACT(PERIOD FROM column) Left expression PERIOD can be valued with YEAR, MONTH, DAY, HOUR, MINUTE, SECOND. Usage example: EXTRACT(YEAR FROM timestamp)!=2020

Group by

It's possible to aggregate values using GROUP BY clause.

Example: SELECT booleanField, AVG(integerField) test, SUM(longField), COUNT(doubleField), COUNT(*) from testIndex GROUP BY booleanField

Having

It's possible to apply filtering on Group by expressions.

Example:
SELECT booleanField, AVG(integerField) test, SUM(longField), COUNT(*) from testIndex GROUP BY booleanField HAVING AVG(integerField)>=4 OR SUM(longField)>=19

Order

Example:
SELECT * FROM column ORDER BY keywordField, integerField DESC

Limit

Example:
SELECT * FROM column LIMIT 100

Compatibility

Tested on 7.4.2 and 7.10.0 Elasticsearch release

About me

Fabrizio Pasti
fabrizio.pasti@gmail.com
https://www.linkedin.com/in/fabrizio-pasti-2340a627

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

简介

elasticsearch jdbc驱动 展开 收起
Java
Apache-2.0
取消

发行版 (1)

全部

贡献者

全部

近期动态

加载更多
不能加载更多了
Java
1
https://gitee.com/gdouyang/esqlj-elastic-jdbc.git
git@gitee.com:gdouyang/esqlj-elastic-jdbc.git
gdouyang
esqlj-elastic-jdbc
esqlj-elastic-jdbc
es8

搜索帮助