I’m seeing issues running Elasticsearch on a GlusterFS volume. I’ve configured Convoy with GlusterFS and the volume is mounted, and all seems well.
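For context, the Elasticsearch data directory is mounted through the convoy-gluster volume driver, roughly along these lines (the volume name, image tag, and the plain docker run form below are placeholders rather than the exact stack definition):

```
# Rough sketch of the mount: a named volume handled by the convoy-gluster
# driver is mapped onto the Elasticsearch data path. "es-data" and the image
# tag are placeholders.
docker run -d \
  --volume-driver=convoy-gluster \
  -v es-data:/usr/share/elasticsearch/data \
  elasticsearch:2.3
```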
When Elasticsearch starts up, I see this:
```
4/3/2016 8:53:55 AM[2016-04-03 12:53:55,083][WARN ][cluster.action.shard ] [Nicole St. Croix] [.kibana][0] received shard failed for target shard [[.kibana][0], node[MlquNusiR2O-9Lc5x2dLeQ], [P], v[1], s[INITIALIZING], a[id=j_vQF1yPRMWPQR56c3Th_w], unassigned_info[[reason=INDEX_CREATED], at[2016-04-03T12:53:50.855Z]]], indexUUID [_4mgkEHzRxawb_2dE3vihA], message [failed recovery], failure [IndexShardRecoveryException[failed recovery]; nested: AlreadyClosedException[Underlying file changed by an external force at 2016-04-03T12:53:54.102783Z, (lock=NativeFSLock(path=/usr/share/elasticsearch/data/elasticsearch/nodes/0/indices/.kibana/0/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],ctime=2016-04-03T12:53:54.102783Z))]; ]
4/3/2016 8:53:55 AM[.kibana][[.kibana][0]] IndexShardRecoveryException[failed recovery]; nested: AlreadyClosedException[Underlying file changed by an external force at 2016-04-03T12:53:54.102783Z, (lock=NativeFSLock(path=/usr/share/elasticsearch/data/elasticsearch/nodes/0/indices/.kibana/0/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],ctime=2016-04-03T12:53:54.102783Z))];
4/3/2016 8:53:55 AM at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:179)
```
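As far as I can tell, Lucene’s NativeFSLock records the lock file’s creation/change time when it takes write.lock and re-checks it later; the “Underlying file changed by an external force” message means that timestamp no longer matches, i.e. something below Elasticsearch is touching the file’s metadata. A quick (hypothetical) way to see whether GlusterFS is shifting that metadata is to watch the ctime of the lock file from the log above while Elasticsearch is the only writer:

```
# Run this a few times while Elasticsearch is the only thing using the volume;
# if ctime keeps changing on its own, the filesystem is altering the metadata
# Lucene checks. GNU stat assumed; path copied from the log above.
stat -c '%n ctime=%z' \
  /usr/share/elasticsearch/data/elasticsearch/nodes/0/indices/.kibana/0/index/write.lock
```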
I’ve seen recommendations to enable “cluster.consistent-metadata” on the Gluster volume, but it seems that isn’t possible with the version of GlusterFS currently shipped with Rancher.
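For anyone who does have shell access to the Gluster peers (which the Rancher convoy-gluster stack doesn’t seem to expose), the setting itself would be applied with something like this; the volume name is a placeholder:

```
# Sketch only: assumes the gluster CLI is available on a storage node and the
# installed GlusterFS version supports the option. "my_volume" is a placeholder.
gluster volume set my_volume cluster.consistent-metadata on
gluster volume info my_volume   # the reconfigured option should show up here
```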
Any thoughts on this? I’m guessing I’ll just have to forgo this in favor of sidekicks.
I have other volume mounts on the Convoy storage that work great, logs and configs for example. But with Elasticsearch, something about how often it reads and writes seems to conflict with Convoy/GlusterFS.
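If it comes to that, the fallback would be keeping the write-heavy data path on local disk and leaving only the low-churn pieces on the shared volume, something like this (paths, volume name, and image tag are placeholders):

```
# Hypothetical fallback: local disk for the index data, convoy-gluster only for
# logs. Note that --volume-driver applies to the named volume ("es-logs"), not
# to the host-path bind mount.
docker run -d \
  --volume-driver=convoy-gluster \
  -v /mnt/es-data:/usr/share/elasticsearch/data \
  -v es-logs:/usr/share/elasticsearch/logs \
  elasticsearch:2.3
```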
I am seeing the same issue. Running the ELK stack without a shared volume through Convoy-Gluster works correctly, but with one I hit the same errors. Have you found a solution to this, @bonovoxly?