Glusterfs and elasticsearch

I’m seeing issues with running elasticsearch on a glusterfs volume. I’ve configured Convoy with Glusterfs and the volume is mounted. All seems well.

When elasticsearch starts up, I see this:

4/3/2016 8:53:55 AM [2016-04-03 12:53:55,083][WARN ][cluster.action.shard ] [Nicole St. Croix] [.kibana][0] received shard failed for target shard [[.kibana][0], node[MlquNusiR2O-9Lc5x2dLeQ], [P], v[1], s[INITIALIZING], a[id=j_vQF1yPRMWPQR56c3Th_w], unassigned_info[[reason=INDEX_CREATED], at[2016-04-03T12:53:50.855Z]]], indexUUID [_4mgkEHzRxawb_2dE3vihA], message [failed recovery], failure [IndexShardRecoveryException[failed recovery]; nested: AlreadyClosedException[Underlying file changed by an external force at 2016-04-03T12:53:54.102783Z, (lock=NativeFSLock(path=/usr/share/elasticsearch/data/elasticsearch/nodes/0/indices/.kibana/0/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],ctime=2016-04-03T12:53:54.102783Z))]; ]
4/3/2016 8:53:55 AM [.kibana][[.kibana][0]] IndexShardRecoveryException[failed recovery]; nested: AlreadyClosedException[Underlying file changed by an external force at 2016-04-03T12:53:54.102783Z, (lock=NativeFSLock(path=/usr/share/elasticsearch/data/elasticsearch/nodes/0/indices/.kibana/0/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],ctime=2016-04-03T12:53:54.102783Z))];
4/3/2016 8:53:55 AM at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:179)

I’ve seen recommendations to enable “cluster.consistent-metadata” on the gluster volume. However, that doesn’t seem to be possible with the version of glusterfs currently shipped with Rancher.

Any thoughts on this? I’m guessing I’ll just have to forgo this and use sidekicks instead.

Are you launching your own version of Elasticsearch? How are you trying to connect it to glusterfs?

Can you provide the docker-compose.yml showing how you are trying to link Elasticsearch?

Sure. I’m actually using it through Convoy, a volume mount.

Here’s part of the docker-compose:

`elasticsearch:
  ports:
  - 9200:9200/tcp
  - 9300:9300/tcp
  labels:
    io.rancher.scheduler.affinity:container_label_soft_ne: io.rancher.stack.name=$${stack_name}
  command:
  - elasticsearch
  - -Des.network.host=0.0.0.0
  image: elasticsearch:latest
  volume-driver: convoy-gluster
  volumes:
  - elasticsearch-data:/usr/share/elasticsearch/data`

I have other volumes mounted from the Convoy storage that work great, logs and configs for example. But with Elasticsearch, something about how often it writes and reads seems to conflict with Convoy/GlusterFS.

I am seeing the same issue as you are. Running the ELK stack without a shared volume through Convoy-Gluster works correctly, but with the shared volume I get the same errors. Have you found a solution to this @bonovoxly?

Anyone get this working yet?

@rsmith and @davidcunningham

I’ve been working on a few other things and haven’t revisited this. :disappointed:

@bonovoxly
I’m seeing the same issue with glusterfs 3.10.1 and haven’t found a solution.

This happens because ES compares the lock file’s ctime against the value it recorded earlier to check that the file is unchanged.

GlusterFS returns the ctime from one of the multiple backend bricks, so the reported ctime can vary between calls:

https://bugzilla.redhat.com/show_bug.cgi?id=1318493

As a result, ES believes the file has been changed by someone else.
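
To make the failure mode concrete, here is a minimal Python sketch of that kind of check. It is only an illustration of the idea described above, not the actual ES/Lucene code; the names `snapshot_ctime`, `ensure_unchanged`, and `LockFileChangedError` are made up for this example.

```python
import os
import tempfile


class LockFileChangedError(RuntimeError):
    """Raised when the lock file's timestamp differs from the recorded one."""


def snapshot_ctime(path):
    # Record the change time (nanoseconds) the filesystem reports right now.
    return os.stat(path).st_ctime_ns


def ensure_unchanged(path, recorded_ctime_ns):
    # Re-read the timestamp and compare it with the value saved at lock time.
    # On a filesystem that can answer with a different value for the same
    # untouched file, this comparison fails even though nobody modified it.
    current_ctime_ns = os.stat(path).st_ctime_ns
    if current_ctime_ns != recorded_ctime_ns:
        raise LockFileChangedError(
            f"{path}: recorded ctime {recorded_ctime_ns}, "
            f"filesystem now reports {current_ctime_ns}"
        )


if __name__ == "__main__":
    # Hypothetical stand-in for write.lock on a local filesystem.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        lock_path = handle.name
    recorded = snapshot_ctime(lock_path)
    ensure_unchanged(lock_path, recorded)  # consistent timestamps: passes
    try:
        # Simulate a mismatching answer by comparing against another value.
        ensure_unchanged(lock_path, recorded + 1)
    except LockFileChangedError as exc:
        print("lock considered invalid:", exc)
    os.remove(lock_path)
```

On a local filesystem the first check passes every time. The second call only simulates the mismatch by passing a different recorded value, which is roughly what the check would see if the volume answered with a ctime from a different brick.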