SUSE Cloud 5: cloud-init fails

Hi guys,

I’d like to ask for your expertise again.
I have a working cloud now (1 admin & 1 control on SLES11, 2 compute nodes with SLES11 and SLES12), my instances get their IPs injected and I have SSH access. Next step is to test cloud-init.
The image I use is a SLES11-SP3 image including the cloud-init package. I tried several ways to provide a simple test script for the VMs: in the Horizon dashboard as direct input, as a file and as a parameter in CLI for nova boot. In the VM’s logs I see no output at all, the log file for cloud-init isn’t even created.
So I tried to start the init process manually with “cloud-init init” on the already running VM and got this result in the nova-api-metadata.log on control node:

2015-08-17 16:25:30.105 22341 INFO nova.network.driver [-] Loading network driver 'nova.network.linux_net' 2015-08-17 16:25:30.314 22341 INFO nova.openstack.common.periodic_task [-] Skipping periodic task _periodic_update_dns because its interval is negative 2015-08-17 16:25:30.319 22341 ERROR nova.wsgi [-] Could not bind to 192.168.124.81:8775 2015-08-17 16:25:30.319 22341 CRITICAL nova [-] error: [Errno 98] Address already in use 2015-08-17 16:25:30.319 22341 TRACE nova Traceback (most recent call last): 2015-08-17 16:25:30.319 22341 TRACE nova File "/usr/bin/nova-api-metadata", line 10, in <module> 2015-08-17 16:25:30.319 22341 TRACE nova sys.exit(main()) 2015-08-17 16:25:30.319 22341 TRACE nova File "/usr/lib64/python2.6/site-packages/nova/cmd/api_metadata.py", line 52, in main 2015-08-17 16:25:30.319 22341 TRACE nova server = service.WSGIService('metadata', use_ssl=should_use_ssl) 2015-08-17 16:25:30.319 22341 TRACE nova File "/usr/lib64/python2.6/site-packages/nova/service.py", line 351, in __init__ 2015-08-17 16:25:30.319 22341 TRACE nova max_url_len=max_url_len) 2015-08-17 16:25:30.319 22341 TRACE nova File "/usr/lib64/python2.6/site-packages/nova/wsgi.py", line 141, in __init__ 2015-08-17 16:25:30.319 22341 TRACE nova self._socket = eventlet.listen(bind_addr, family, backlog=backlog) 2015-08-17 16:25:30.319 22341 TRACE nova File "/usr/lib64/python2.6/site-packages/eventlet/convenience.py", line 39, in listen 2015-08-17 16:25:30.319 22341 TRACE nova sock.bind(addr) 2015-08-17 16:25:30.319 22341 TRACE nova File "<string>", line 1, in bind 2015-08-17 16:25:30.319 22341 TRACE nova error: [Errno 98] Address already in use 2015-08-17 16:25:30.319 22341 TRACE nova

In the VM’s logs there are two new files now, /var/log/cloud-init.log which is empty and /var/log/cloud-init-output.log which has following content:

Cloud-init v. 0.7.6 running 'init' at Mon, 17 Aug 2015 13:48:56 +0000. Up 1423.36 seconds.
ci-info: +++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++
ci-info: +--------+------+----------------+---------------+-------------------+
ci-info: | Device |  Up  |    Address     |      Mask     |     Hw-Address    |
ci-info: +--------+------+----------------+---------------+-------------------+
ci-info: |   lo   | True |   127.0.0.1    |   255.0.0.0   |         .         |
ci-info: |  eth0  | True | 192.168.123.49 | 255.255.255.0 | fa:16:3e:08:fd:e5 |
ci-info: +--------+------+----------------+---------------+-------------------+
ci-info: +++++++++++++++++++++++++++++++++Route info++++++++++++++++++++++++++++++++++
ci-info: +-------+---------------+---------------+---------------+-----------+-------+
ci-info: | Route |  Destination  |    Gateway    |    Genmask    | Interface | Flags |
ci-info: +-------+---------------+---------------+---------------+-----------+-------+
ci-info: |   0   |    0.0.0.0    | 192.168.123.1 |    0.0.0.0    |    eth0   |   UG  |
ci-info: |   1   |   127.0.0.0   |    0.0.0.0    |   255.0.0.0   |     lo    |   U   |
ci-info: |   2   |  169.254.0.0  |    0.0.0.0    |  255.255.0.0  |    eth0   |   U   |
ci-info: |   3   | 192.168.123.0 |    0.0.0.0    | 255.255.255.0 |    eth0   |   U   |
ci-info: +-------+---------------+---------------+---------------+-----------+-------+
2015-08-17 15:48:58,115 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [0/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by <class 'socket.error'>: [Errno 111] Connection refused)]
2015-08-17 15:48:59,119 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [1/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by <class 'socket.error'>: [Errno 111] Connection refused)]
2015-08-17 15:49:00,123 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [2/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by <class 'socket.error'>: [Errno 111] Connection refused)]

So I see one major difference: the VM is trying to receive metadata from 169.254.169.254 (it fails), but control node is trying to get it from itself (192.168.124.81:8775), which also fails. In the OpenStack docs in the section Metadata service is this:

I checked root@d0c-c4-7a-06-72-96:/var/log # curl http://192.168.124.81:8775 1.0 2007-01-19 2007-03-01 2007-08-29 2007-10-10 2007-12-15 2008-02-01 2008-09-01 2009-04-04 latest

but a timeout when trying root@d0c-c4-7a-06-72-96:~ # curl http://169.254.169.254/openstack curl: (7) couldn't connect to host

Now I’m really confused and have no idea what to do. Can anyone explain how it’s supposed to work and how I get my scripts onto my VMs?

Regards,
Eugen

I have to correct my statements.
The error in nova-api-metadata.log occurred because I tried to start the service openstack-nova-api-metadata manually as it was in status “unused” and I had read a post on that topic so I tried the proposed solution.
Then I started cloud-init again and got the errors there. So these results are not directly related to each other.
My question is: is the api-metadata daemon required to run at all or is nova-api managing everything? And is any daemon supposed to listen to 169.254.169.254? Or does that address in the client log indicate that some configuration (pointing to the right server) is missing?

Update:
Cloud-init works as expected now. All the time I was wandering why it was not executed at boot time of the created instance. The required services (cloud-init-local, cloud-init, cloud-config, cloud-final) were not enabled in the image I used for the VMs. So I enabled them, made a snapshot of the VM and used that as a new image, now cloud-init is executed during boot time. I just have one limitation here, this only works if I use config-drive to receive meta-data and user-data. If I don’t use config-drive, I get the error messages in /var/log/messages and in /var/log/cloud-init-output.log:

Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [12/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by <class 'socket.error'>: [Errno 111] Connection refused)]

So I still can’t reach the metadata without config-drive. This refers to the post here. It would be great if someone could tell me, what I’m missing. Is my cloud misconfigured? Is it a bug?

Regards,
Eugen