Single-HDD Ceph cluster

Hi.

I’m in charge of setting up a test Ceph cluster at our institution. The hardware it has to run on consists of 6 single-HDD computers (adding or moving disks around is absolutely not an option).

As a consequence, I partitioned each disk to hold the operating system (openSUSE Leap 15) plus a free (unmounted) partition for testing Ceph, as follows:

#######
# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 465,8G  0 disk 
├─sda1   8:1    0   500M  0 part /boot/efi
├─sda2   8:2    0    16G  0 part [SWAP]
├─sda3   8:3    0  49,3G  0 part /
└─sda4   8:4    0   400G  0 part 
sr0     11:0    1   3,7G  0 rom
#######

Our intention is for sda4 to be used by the Ceph cluster to store data spread across the 6 machines, accessible to all of them while providing some level of redundancy (e.g. losing two computers should keep the data available).
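
To be concrete about the redundancy goal, what I picture setting up once the cluster is running is something along these lines (just a sketch; the pool names, PG counts and EC profile name are placeholders I made up):

#######
# 3-way replicated pool: data survives the loss of any two of the six hosts
ceph osd pool create testdata 64 64 replicated
ceph osd pool set testdata size 3
# or an erasure-coded pool (k=4, m=2) spread over the hosts
ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
ceph osd pool create testdata_ec 64 64 erasure ec42
#######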

Among the difficulties I faced in the various deployment stages, we hit the same issue with empty profile-default/stack/default/ceph/minions/node*.yml files that vazaari reported on https://forums.suse.com/showthread.php?11788-state-orch-ceph-stage-discovery-does-not-collect-HDD-info, and we solved it in a similar fashion with hand-edited files:

#######
# salt -I 'roles:storage' pillar.get ceph
node02:
    ----------
    storage:
        ----------
        osds:
            ----------
            /dev/sda4:
                ----------
                format:
                    bluestore
                standalone:
                    True
(and so on for all 6 machines)
#######
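
For reference, the data I hand-entered per node boils down to this (a sketch reconstructed from the pillar output above; I’m not completely sure about the exact top-level nesting inside the yml files):

#######
# one of the profile-default/stack/default/ceph/minions/node*.yml files (sketch)
ceph:
  storage:
    osds:
      /dev/sda4:
        format: bluestore
        standalone: True
#######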

Please note that the device entry points straight to the sda4 partition, rather than to a whole device (sdb) as in vazaari’s example.

After this, everything goes smoothly until I run:

#######
# salt-run state.orch ceph.stage.deploy
(...)
[14/71]   ceph.sysctl on
          node01....................................... ✓ (0.5s)
          node02........................................ ✓ (0.7s)
          node03....................................... ✓ (0.6s)
          node04......................................... ✓ (0.5s)
          node05....................................... ✓ (0.6s)
          node06.......................................... ✓ (0.5s)

[15/71]   ceph.osd on
          node01...................................... ❌ (0.7s)
          node02........................................ ❌ (0.7s)
          node03....................................... ❌ (0.7s)
          node04......................................... ❌ (0.6s)
          node05....................................... ❌ (0.6s)
          node06.......................................... ❌ (0.7s)

Ended stage: ceph.stage.deploy succeeded=14/71 failed=1/71 time=624.7s

Failures summary:

ceph.osd (/srv/salt/ceph/osd):
  node02:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node02 for cephdisks.list
  node03:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node03 for cephdisks.list
  node01:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node01 for cephdisks.list
  node04:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node04 for cephdisks.list
  node05:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node05 for cephdisks.list
  node06:
    deploy OSDs: Module function osd.deploy threw an exception. Exception: Mine on node06 for cephdisks.list
#######
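
For what it’s worth, my reading of the error is that the Salt mine simply holds no cephdisks.list data for the minions. I assume this could be checked with something like the following, though I’m not sure it’s the right test:

#######
# does the mine hold any disk data for a node? (sketch)
salt 'node02' mine.get 'node02' cephdisks.list
# and what does the DeepSea module itself report when called directly?
salt 'node02' cephdisks.list
#######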

This is where I got stuck. For extra information, here is our “policy.cfg” file:

#########
cluster-ceph/cluster/*.sls
profile-default/cluster/*.sls
profile-default/stack/default/ceph/minions/*yml
config/stack/default/global.yml
config/stack/default/ceph/cluster.yml
role-master/cluster/node01.sls
role-admin/cluster/*.sls
role-mon/cluster/*.sls
role-mgr/cluster/*.sls
role-mds/cluster/*.sls
role-ganesha/cluster/*.sls
role-client-nfs/cluster/*.sls
role-client-cephfs/cluster/*.sls
##########

Can anybody help me with this issue? Where am I most likely messing things up? What should I look for? Should I have left that space unpartitioned instead of creating sda4, so Ceph could find it on its own? Or is this simply not feasible with openSUSE Leap + Ceph?

Thanks a lot in advance for any help provided!

Sincerely yours,

Jones

Besides the partition issue, in the mailing list you didn’t mention you were trying to use Leap with SUSE Enterprise Storage. SES is based on SLES; the current SES 5 release (based on Luminous) runs on SLES12-SP3. Running it on Leap will very likely result in other problems; I’m surprised you even got this far. :wink:
Is it a test subscription for SES? Then you’ll need another one for SLES, too, and will have to re-install the nodes, although that will not resolve the partition issue.

[QUOTE=eblock;54152]Besides the partition issue, in the mailing list you didn’t mention you were trying to use Leap with SUSE Enterprise Storage. SES is based on SLES; the current SES 5 release (based on Luminous) runs on SLES12-SP3. Running it on Leap will very likely result in other problems; I’m surprised you even got this far. :wink:
Is it a test subscription for SES? Then you’ll need another one for SLES, too, and will have to re-install the nodes, although that will not resolve the partition issue.[/QUOTE]
Hi eblock.

Actually, I already ran into breakage when I accidentally messed up the repositories and installed SES packages mixed with Leap ones.

So I’m not using SES; rather, I installed all the official Ceph- and Salt-related packages needed from the openSUSE Leap 15 repositories. I’m also (mostly) following the instructions from the openSUSE manuals for Ceph (I’ll need some time to dig up the precise link, though: traffic jam).

So, given that I’m not using SES itself but rather its “core components”, could someone help me create a functional Ceph cluster installed on a specific partition instead of a whole device? Is that even feasible?

Thanks a lot in advance.

Just for completeness’ sake, I’ll paste my answer from the ML:

During OSD deployment SES creates two partitions on the OSD drive, so it will look something like this:

[CODE]vdb      253:16   0    5G  0 disk
├─vdb1   253:17   0  100M  0 part /var/lib/ceph/osd/ceph-1
└─vdb2   253:18   0  4,9G  0 part[/CODE]
That’s why the OSD creation fails in your setup.

Hi Johannes,

[QUOTE=johannesrs;54154]Hi eblock.

Actually, I already ran into breakage when I accidentally messed up the repositories and installed SES packages mixed with Leap ones.

So I’m not using SES; rather, I installed all the official Ceph- and Salt-related packages needed from the openSUSE Leap 15 repositories[/QUOTE]

the lack of responses (other than Eugen’s) is most likely because this forum is for questions about SES, the product. So the Ceph mailing list is probably still the best place to get answers about your current setup. If Eugen redirected you here, it was probably because he thought you were running SES on SLES - at least that’s how I read his earlier reply.

Regards,
J

As I already wrote on the mailing list, you could build your cluster with LVM. But that means you don’t get the DeepSea automation and you’ll have to set up the cluster manually, which is actually not a big deal.

You could create a logical volume on your spare partition and deploy OSDs with ceph-volume lvm.
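
In your case that would mean first turning the spare sda4 partition into an LVM physical volume and volume group, roughly like this (a sketch; the “vg0” name just matches the example below, and I’m assuming sda4 is still unused):

[CODE]# create a PV on the spare partition and a VG on top of it
pvcreate /dev/sda4
vgcreate vg0 /dev/sda4[/CODE]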

[CODE]# create logical volume "osd4" on volume group "vg0"
ceph-2:~ # lvcreate -n osd4 -L 1G vg0
Logical volume "osd4" created.

# prepare the logical volume for bluestore

ceph-2:~ # ceph-volume lvm prepare --bluestore --data vg0/osd4
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 3b9eaa0e-9a4a-49ec-9042-34ad19a59592
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-4
→ Absolute path not found for executable: restorecon
→ Ensure $PATH environment variable contains common executable locations
Running command: /bin/chown -h ceph:ceph /dev/vg0/osd4
Running command: /bin/chown -R ceph:ceph /dev/dm-4
Running command: /bin/ln -s /dev/vg0/osd4 /var/lib/ceph/osd/ceph-4/block
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-4/activate.monmap
stderr: got monmap epoch 2
Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-4/keyring --create-keyring --name osd.4 --add-key AQD3j49bDzsFIBAAsXQjhbwqFQwt/Vqq9VOnsw==
stdout: creating /var/lib/ceph/osd/ceph-4/keyring
added entity osd.4 auth auth(auid = 18446744073709551615 key=AQD3j49bDzsFIBAAsXQjhbwqFQwt/Vqq9VOnsw== with 0 caps)
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4/keyring
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4/
Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 4 --monmap /var/lib/ceph/osd/ceph-4/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-4/ --osd-uuid 3b9eaa0e-9a4a-49ec-9042-34ad19a59592 --setuser ceph --setgroup ceph
→ ceph-volume lvm prepare successful for: vg0/osd4

# activate the LVM OSD

ceph-2:~ # ceph-volume lvm activate 4 3b9eaa0e-9a4a-49ec-9042-34ad19a59592
Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/vg0/osd4 --path /var/lib/ceph/osd/ceph-4 --no-mon-config
Running command: /bin/ln -snf /dev/vg0/osd4 /var/lib/ceph/osd/ceph-4/block
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-4/block
Running command: /bin/chown -R ceph:ceph /dev/dm-4
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
Running command: /bin/systemctl enable ceph-volume@lvm-4-3b9eaa0e-9a4a-49ec-9042-34ad19a59592
stderr: Created symlink /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-4-3b9eaa0e-9a4a-49ec-9042-34ad19a59592.service → /usr/lib/systemd/system/ceph-volume@.service.
Running command: /bin/systemctl enable --runtime ceph-osd@4
stderr: Created symlink /run/systemd/system/ceph-osd.target.wants/ceph-osd@4.service → /usr/lib/systemd/system/ceph-osd@.service.
Running command: /bin/systemctl start ceph-osd@4
→ ceph-volume lvm activate successful for osd ID: 4[/CODE]

or run “prepare” and “activate” at the same time using “ceph-volume lvm create”:

[CODE]ceph-2:~ # ceph-volume lvm create --bluestore --data vg0/osd4
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new a036876f-4cfb-4254-ae3f-52d1ea75b31a
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-4
--> Absolute path not found for executable: restorecon
--> Ensure $PATH environment variable contains common executable locations
Running command: /bin/chown -h ceph:ceph /dev/vg0/osd4
Running command: /bin/chown -R ceph:ceph /dev/dm-4
Running command: /bin/ln -s /dev/vg0/osd4 /var/lib/ceph/osd/ceph-4/block
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-4/activate.monmap
stderr: got monmap epoch 2
Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-4/keyring --create-keyring --name osd.4 --add-key AQCqsY9bvQdVDhAAHTPvt8hrQem8O8D+v7WGaw==
stdout: creating /var/lib/ceph/osd/ceph-4/keyring
stdout: added entity osd.4 auth auth(auid = 18446744073709551615 key=AQCqsY9bvQdVDhAAHTPvt8hrQem8O8D+v7WGaw== with 0 caps)
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4/keyring
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4/
Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 4 --monmap /var/lib/ceph/osd/ceph-4/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-4/ --osd-uuid a036876f-4cfb-4254-ae3f-52d1ea75b31a --setuser ceph --setgroup ceph
--> ceph-volume lvm prepare successful for: vg0/osd4
Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/vg0/osd4 --path /var/lib/ceph/osd/ceph-4 --no-mon-config
Running command: /bin/ln -snf /dev/vg0/osd4 /var/lib/ceph/osd/ceph-4/block
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-4/block
Running command: /bin/chown -R ceph:ceph /dev/dm-4
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
Running command: /bin/systemctl enable ceph-volume@lvm-4-a036876f-4cfb-4254-ae3f-52d1ea75b31a
stderr: Created symlink /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-4-a036876f-4cfb-4254-ae3f-52d1ea75b31a.service → /usr/lib/systemd/system/ceph-volume@.service.
Running command: /bin/systemctl enable --runtime ceph-osd@4
Running command: /bin/systemctl start ceph-osd@4
--> ceph-volume lvm activate successful for osd ID: 4
--> ceph-volume lvm create successful for: vg0/osd4[/CODE]

These steps require an existing cluster with MONs (and MGRs) deployed; you can find the docs for manual deployment here. You can also use ceph-deploy (docs here) to create your cluster, but I would recommend not deploying the OSDs with ceph-deploy, as I’m not sure the tool supports deploying OSDs on logical volumes. Since you only have a couple of OSDs, that should be manageable.
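
Once an OSD has been created this way on each of the six nodes, a quick sanity check would be something like (sketch):

[CODE]ceph osd tree
ceph -s[/CODE]

All six OSDs should show up as “up” and “in”, and the cluster status should eventually report HEALTH_OK.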