Hi,
I have four nodes (1x admin, 3x OSD/MON, …) prepared for the Ceph cluster.
For some reason the discovery stage does not create templates for the OSD nodes.
stage.0 completes successfully.
stage.1 completes successfully, but the OSD templates are missing:
[CODE]
ls -la profile-default/stack/default/ceph/minions/
total 0
drwxr-xr-x 1 salt salt 0 Mar 13 16:03 .
drwxr-xr-x 1 salt salt 14 Mar 13 16:03 …[/CODE]
[CODE]
ls -la profile-default/cluster/
total 0
drwxr-xr-x 1 salt salt 0 Mar 13 16:03 .
drwxr-xr-x 1 salt salt 38 Mar 13 16:03 …[/CODE]
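As far as I understand the DeepSea proposal layout, a successful discovery should leave one entry per storage minion in both directories, roughly like this (file names based on my hostnames):
[CODE]# ls profile-default/cluster/
tw-ceph-node1.sls  tw-ceph-node2.sls  tw-ceph-node3.sls
# ls profile-default/stack/default/ceph/minions/
tw-ceph-node1.yml  tw-ceph-node2.yml  tw-ceph-node3.yml[/CODE]
Instead, both directories are empty.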
DeepSea monitor output for stage.1:
[CODE]Starting stage: ceph.stage.1
Parsing ceph.stage.1 steps... ✓
Stage initialization output:
salt-api : valid
deepsea_minions : valid
master_minion : valid
ceph_version : valid
[1/4] minions.ready(timeout=300)................................. ✓ (0.4s)
[2/4] ceph.refresh on
tw-ceph-admin.............................................. ✓ (0.3s)
[3/4] populate.proposals......................................... ✓ (5s)
[4/4] proposal.populate.......................................... ✓ (1s)
Ended stage: ceph.stage.1 succeeded=4/4 time=28.3s[/CODE]
Master can see all minions:
[CODE]# salt-key -L
Accepted Keys:
tw-ceph-admin
tw-ceph-node1
tw-ceph-node2
tw-ceph-node3
Denied Keys:
Unaccepted Keys:
Rejected Keys:[/CODE]
Each of the (intended) OSD nodes has three unformatted HDDs:
[CODE]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 1 136.6G 0 disk
├─sda1 8:1 1 7M 0 part
├─sda2 8:2 1 2G 0 part [SWAP]
└─sda3 8:3 1 134.6G 0 part /
sdb 8:16 1 136.6G 0 disk
sdc 8:32 1 136.6G 0 disk
sr0 11:0 1 1024M 0 rom [/CODE]
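In case it is relevant, DeepSea's cephdisks module can also be queried directly from the master; as far as I understand, this is what the discovery stage uses to pick up unused disks:
[CODE]# salt 'tw-ceph-node*' cephdisks.list[/CODE]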
Please help me find out what I am doing wrong.
Additional info:
[CODE]# lsb_release -a
LSB Version: n/a
Distributor ID: SUSE
Description: SUSE Linux Enterprise Server 12 SP3
Release: 12.3
Codename: n/a
[/CODE]
[CODE]# zypper se -s --installed-only | grep ses-release
i+ | ses-release | package | 5-1.54 | x86_64 | SES5
i+ | ses-release | package | 5-1.54 | x86_64 | SUSE-Enterprise-Storage-5-Pool
i | ses-release-cd | package | 5-1.54 | x86_64 | SES5 [/CODE]
Weird warning:
salt-master[28198]: [WARNING ] Although 'dmidecode' was found in path, the current user cannot execute it. Grains output might not be accurate.
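If the salt-master really cannot run dmidecode, its hardware grains could be incomplete; assuming the master runs as the salt user (the proposal files above are owned by salt), that can be checked with:
[CODE]# ls -l /usr/sbin/dmidecode
# sudo -u salt /usr/sbin/dmidecode | head[/CODE]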
Firewall stopped, apparmor disabled.
Best regards,
Serhiy.
Stage 0
[CODE]Starting stage: ceph.stage.0
Parsing ceph.stage.0 steps… ✓
Stage initialization output:
deepsea_minions : valid
master_minion : valid
ceph_version : valid
[1/14] ceph.salt-api on
tw-ceph-admin… ✓ (5s)
[2/14] ceph.sync on
tw-ceph-admin… ✓ (1s)
[3/14] ceph.repo on
tw-ceph-admin… ✓ (0.8s)
[4/14] ceph.updates on
tw-ceph-admin… ✓ (10s)
[5/14] filequeue.remove(item=lock)… ✓ (0.0s)
[6/14] ceph.updates.restart on
tw-ceph-admin… ✓ (2s)
[7/14] filequeue.add(item=complete)… ✓ (0.0s)
[8/14] minions.ready(timeout=300)… ✓ (0.4s)
[9/14] ceph.repo on
tw-ceph-node2… ✓ (0.3s)
tw-ceph-node3… ✓ (0.3s)
tw-ceph-node1… ✓ (0.3s)
tw-ceph-admin… ✓ (0.3s)
[10/14] ceph.packages.common on
tw-ceph-node2… ✓ (2s)
tw-ceph-node3… ✓ (2s)
tw-ceph-node1… ✓ (2s)
tw-ceph-admin… ✓ (3s)
[11/14] ceph.sync on
tw-ceph-node2… ✓ (1.0s)
tw-ceph-node3… ✓ (1s)
tw-ceph-node1… ✓ (1s)
tw-ceph-admin… ✓ (1s)
[12/14] ceph.mines on
tw-ceph-node2… ✓ (2s)
tw-ceph-node3… ✓ (2s)
tw-ceph-node1… ✓ (2s)
tw-ceph-admin… ✓ (2s)
[13/14] ceph.updates on
tw-ceph-node2… ✓ (19s)
tw-ceph-node3… ✓ (16s)
tw-ceph-node1… ✓ (21s)
tw-ceph-admin… ✓ (11s)
[14/14] ceph.updates.restart on
tw-ceph-node2… ✓ (3s)
tw-ceph-node3… ✓ (3s)
tw-ceph-node1… ✓ (3s)
tw-ceph-admin… ✓ (3s)
Ended stage: ceph.stage.0 succeeded=14/14 time=92.7s[/CODE]
It seems the problem is the lack of storage nodes and OSDs.
So I can generate templates with the following command:
[CODE]# salt-run proposal.populate leftovers=True standalone=True target='tw-ceph-node*'[/CODE]
but it generates an empty osds list:
[CODE]# cat profile-default/stack/default/ceph/minions/tw-ceph-node1.yml
ceph:
  storage:
    osds: {}[/CODE]
Could you provide a sample .yml for the storage role?
[QUOTE=vazaari;51528]It seems the problem is the lack of storage nodes and OSDs.
So I can generate templates with the following command:
salt-run proposal.populate leftovers=True standalone=True target='tw-ceph-node*'
but it generates an empty osds list:
# cat profile-default/stack/default/ceph/minions/tw-ceph-node1.yml
ceph:
  storage:
    osds: {}
Could you provide a sample .yml for the storage role?[/QUOTE]
Here is a sample:
[CODE]ceph:
  storage:
    osds:
      /dev/disk/by-id/ata-ST4000VN0001-1SF178_Z4F0PS49:
        db: /dev/disk/by-id/nvme-20000000001000000e4d25c0bf42f4c00
        db_size: 500m
        format: bluestore
        wal: /dev/disk/by-id/nvme-20000000001000000e4d25c0bf42f4c00
        wal_size: 500m[/CODE]
…and so on for the next OSDs.
Thomas
[CODE]ceph:
  storage:
    osds:
      /dev/disk/by-id/ata-ST4000VN0001-1SF178_Z4F0PS49:
        db: /dev/disk/by-id/nvme-20000000001000000e4d25c0bf42f4c00
        db_size: 500m
        format: bluestore
        wal: /dev/disk/by-id/nvme-20000000001000000e4d25c0bf42f4c00
        wal_size: 500m[/CODE]
Reposting with better “layout”
Thomas
Thomas, thanks a lot.
I’ve created the yml with the following content:
[CODE]ceph:
  storage:
    osds:
      /dev/sdb:
        format: bluestore
        standalone: true
      /dev/sdc:
        format: bluestore
        standalone: true[/CODE]
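To confirm that the profile actually reaches the minion pillar once stage.2 has run (the key path follows the yml above), something like this should show the osds dictionary:
[CODE]# salt 'tw-ceph-node1' pillar.get ceph:storage[/CODE]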
stage.2 passed successfully.
stage.3 ends with errors:
[CODE]Module function osd.deploy threw an exception. Exception: Mine on tw-ceph-node1 for cephdisks.list[/CODE]
The HDDs are still not recognized:
[CODE]# salt 'tw-ceph-node*' cephdisks.list
tw-ceph-node2:
tw-ceph-node3:
tw-ceph-node1:[/CODE]
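The exception mentions the Salt mine, so for completeness the mine data can be refreshed and re-queried with the standard Salt mine functions:
[CODE]# salt 'tw-ceph-node*' mine.flush
# salt 'tw-ceph-node*' mine.update
# salt 'tw-ceph-node*' cephdisks.list[/CODE]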
The nodes themselves see the disks, e.g.:
[CODE]# hwinfo --disk | egrep 'sdb|sdc'
  SysFS ID: /class/block/sdb
  Device File: /dev/sdb (/dev/sg1)
  Device Files: /dev/sdb, /dev/disk/by-id/scsi-1ADAPTEC_ARRAY_4022A45B, /dev/disk/by-id/scsi-25ba4224000d00000, /dev/disk/by-id/scsi-SServeRA_disk1_4022A45B, /dev/disk/by-path/pci-0000:04:00.0-scsi-0:0:1:0
  SysFS ID: /class/block/sdc
  Device File: /dev/sdc (/dev/sg2)
  Device Files: /dev/sdc, /dev/disk/by-id/scsi-1ADAPTEC_ARRAY_354EA45B, /dev/disk/by-id/scsi-25ba44e3500d00000, /dev/disk/by-id/scsi-SServeRA_disk2_354EA45B, /dev/disk/by-path/pci-0000:04:00.0-scsi-0:0:2:0[/CODE]
It seems that DeepSea does not like HDDs behind RAID controllers, or at least behind some of them.
In my case the RAID controller does not support JBOD mode, so I have to create single-disk volumes, which the host recognizes as ordinary HDDs.
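One thing that might be worth trying (destructive: it erases everything on the device) is wiping any leftover controller/partition signatures from those volumes before re-running the stages:
[CODE]# wipefs --all /dev/sdb
# sgdisk --zap-all /dev/sdb
# dd if=/dev/zero of=/dev/sdb bs=1M count=10[/CODE]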
I’ve also tried to deploy Ceph with ‘ceph-deploy’, and it failed to create OSDs as well…
But after I created one partition on each HDD, the OSDs were created successfully with:
[CODE]# ceph-deploy osd create tw-ceph-nodeY --data /dev/sdX1[/CODE]
After that I played a little bit with destroy, purge, … and got it working with unpartitioned disks as well.
Now I’m trying to ‘migrate’ from ceph-deploy to DeepSea (as in the ses[3,4] to ses5 case).
I have successfully passed stages 0, 1 and 2, edited /srv/modules/runners/validate.py to bypass the ‘4 nodes’ requirement, and am stuck on stage 3:
[CODE][13/44] ceph.osd.auth on
tw-ceph-admin… (2s)
Ended stage: ceph.stage.3 succeeded=12/44 failed=1/44 time=85.1s
Failures summary:
ceph.osd.auth (/srv/salt/ceph/osd/auth):
tw-ceph-admin:
auth /srv/salt/ceph/osd/cache/bootstrap.keyring: Command “ceph auth add client.bootstrap-osd -i /srv/salt/ceph/osd/cache/bootstrap.keyring” run
stdout:
stderr: Error EINVAL: entity client.bootstrap-osd exists but caps do not match[/CODE]
The following trick helps to solve the ‘caps do not match’ error and pass stage 3:
[CODE]# ceph auth caps client.bootstrap-osd mgr "allow r" mon "allow profile bootstrap-osd"[/CODE]
Closer and closer…
Faced the igw issue with stage 4:
[CODE][6/16] ceph.igw on
tw-ceph-node1… (4s)
…
Ended stage: ceph.stage.4 succeeded=15/16 failed=1/16 time=547.9s
…
Failures summary:
ceph.igw (/srv/salt/ceph/igw):
tw-ceph-node1:
reload lrbd: Module function service.restart executed
enable lrbd: Service lrbd has been enabled, and is dead
[/CODE]
Not yet solved.
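A place to look for why lrbd dies right after being enabled is systemd itself:
[CODE]# systemctl status lrbd
# journalctl -u lrbd -b[/CODE]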
[QUOTE=vazaari;51561]Closer and closer…
Faced the igw issue with stage 4:
[CODE][6/16] ceph.igw on
tw-ceph-node1… (4s)
…
Ended stage: ceph.stage.4 succeeded=15/16 failed=1/16 time=547.9s
…
Failures summary:
ceph.igw (/srv/salt/ceph/igw):
tw-ceph-node1:
reload lrbd: Module function service.restart executed
enable lrbd: Service lrbd has been enabled, and is dead
[/CODE]
Not yet solved.[/QUOTE]
Maybe this is what you encountered: https://www.novell.com/support/kb/doc.php?id=7018668
Thomas
[QUOTE=thsundel;51566]Maybe this is what you encountered: https://www.novell.com/support/kb/doc.php?id=7018668
Thomas[/QUOTE]
I’ve recreated the RBD & iSCSI configuration from openATTIC.