Integration with Elasticsearch-Hadoop

By default, Elasticsearch-Hadoop uses a seed node to access all back-end Elasticsearch nodes, which can create hotspots and distribute requests unevenly. To improve resource utilization on the back-end Elasticsearch nodes, you can send the traffic through INFINI Gateway, which routes each request precisely to the appropriate Elasticsearch node.

Write Acceleration

If you import data with Elasticsearch-Hadoop, you can point it at INFINI Gateway and improve write throughput by setting the following Elasticsearch-Hadoop parameters:

| Name | Type | Description |
| --- | --- | --- |
| es.nodes | string | List of addresses used to access the gateway, for example, localhost:8000,localhost:8001 |
| es.nodes.discovery | bool | When set to false, sniff mode is not used and only the configured back-end nodes are accessed. |
| es.nodes.wan.only | bool | When set to true, proxy mode is enabled and data is forcibly sent through the gateway address. |
| es.batch.size.entries | int | Number of documents per batch. Set a larger value to improve throughput, for example, 5000. |
| es.batch.size.bytes | string | Size of each batch transmission. Set a larger value to improve throughput, for example, 20mb. |
| es.batch.write.refresh | bool | Set to false to prevent an active refresh after each write and improve throughput. |
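
As an illustration of how these parameters fit together, the following minimal sketch collects them into a single options dictionary. The helper name `gateway_write_options` is hypothetical and not part of Elasticsearch-Hadoop or INFINI Gateway; the gateway addresses and batch values are the example values from the table above. The commented Spark call at the end assumes a PySpark job with the Elasticsearch-Hadoop Spark connector available and a hypothetical index name.

```python
def gateway_write_options(gateway_addrs, batch_entries=5000, batch_bytes="20mb"):
    """Build Elasticsearch-Hadoop settings that send writes through the gateway.

    Hypothetical helper for illustration only; values are kept as strings,
    the form in which connector settings are usually passed.
    """
    if not gateway_addrs:
        raise ValueError("at least one gateway address is required")
    return {
        # Gateway endpoints, not the Elasticsearch nodes themselves.
        "es.nodes": ",".join(gateway_addrs),
        # Disable sniffing so only the configured addresses are used.
        "es.nodes.discovery": "false",
        # Proxy mode: force all traffic through the configured addresses.
        "es.nodes.wan.only": "true",
        # Larger batches to improve throughput.
        "es.batch.size.entries": str(batch_entries),
        "es.batch.size.bytes": batch_bytes,
        # Skip the refresh after each bulk write.
        "es.batch.write.refresh": "false",
    }


options = gateway_write_options(["localhost:8000", "localhost:8001"])
for key, value in options.items():
    print(f"{key}={value}")

# Assumed usage in a PySpark job (requires the Elasticsearch-Hadoop
# Spark connector; "my-index" is a placeholder index name):
# df.write.format("org.elasticsearch.spark.sql").options(**options).save("my-index")
```

The batch values are starting points taken from the examples above; suitable sizes depend on document size and on the capacity of the gateway and the back-end cluster.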