[one-users] How to fix hosts with deleted /var/tmp/one files

Jaime Melis jmelis at opennebula.org
Mon Nov 17 02:14:24 PST 2014


Hi Steven,

> In several occasions in the past I have forgotten to
> tell my tmpwatch utility not to delete files out of /var/tmp/one
> and thus I have seen most of the remotes get deleted off of
> my cloud hosts, but opennebula 4.8 doesn't report any problems.


You are right: when we decided to set /var/tmp/one as the default remotes
directory, we weren't aware of the tmpwatch utility.
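
For anyone hitting the same problem, one way to keep tmpwatch away from the
remotes is an exclusion in the cleanup job. This is only a minimal sketch,
assuming a tmpwatch build that supports -x/--exclude and a job that cleans
/var/tmp with a 30-day age. The age and the TMPWATCH override variable are
illustrative and not taken from any particular distribution's cron script:

```shell
# Command to run; set TMPWATCH=echo for a dry run that only prints the arguments.
TMPWATCH="${TMPWATCH:-tmpwatch}"

# Clean /var/tmp but skip the OpenNebula remotes directory and its contents.
# The 30d age is an assumed value; match it to your existing cleanup job.
clean_var_tmp() {
    "$TMPWATCH" -x /var/tmp/one 30d /var/tmp
}
```

Calling clean_var_tmp from the daily cleanup job in place of the plain
tmpwatch call on /var/tmp should leave /var/tmp/one untouched.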

However, in our case, when we remove /var/tmp/one the driver resends the
probes automatically:

Mon Nov 17 11:07:04 2014 [Z0][InM][D]: Monitoring host localhost (0)
Mon Nov 17 11:07:05 2014 [Z0][InM][I]: Command execution fail: 'if [ -x
"/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm
/var/lib/one//datastores 4124 20 0 localhost; else
     exit 42; fi'
Mon Nov 17 11:07:05 2014 [Z0][InM][I]: ExitCode: 42
Mon Nov 17 11:07:05 2014 [Z0][InM][I]: Remote worker node files not found
Mon Nov 17 11:07:05 2014 [Z0][InM][I]: Updating remotes
Mon Nov 17 11:07:10 2014 [Z0][InM][D]: Host localhost (0) successfully
monitored.

Can you find something similar in your log file? If you don't see these
messages, there might be a bug somewhere or a configuration problem. It
would be great to narrow it down.
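
A quick way to look for those messages is to grep the log for the two lines
that mark the re-send. A minimal sketch: the function name is made up for
illustration, and the default path /var/log/one/oned.log is an assumption
about your installation, so pass your own log path if it differs:

```shell
# Print the "remotes re-sent" messages found in an oned log.
# Returns non-zero when no such message is present.
find_remotes_resend() {
    log="${1:-/var/log/one/oned.log}"   # default path is an assumption
    grep -E 'Remote worker node files not found|Updating remotes' "$log"
}
```

If this prints nothing for a host whose /var/tmp/one was emptied, the driver
never noticed the missing probes, which is the case worth investigating.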


> I try to fix the remotes by doing "onehost sync" and nothing happens.
> I try to disable/enable the host and nothing happens there either.
> The only thing that seems to revive it is if I actually
> onehost delete/ onehost create.

Right, a plain "onehost sync" is not enough here; you should do
"onehost sync --force".

Take a look at this section:
http://docs.opennebula.org/4.10/administration/hosts_and_clusters/host_guide.html#sync

In a nutshell, you can version the probes and update specific nodes or
clusters. If there is no version change, onehost sync will not do anything
unless --force is used.
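
As a sketch of the forced re-sync, run as the user that normally manages
hosts (typically oneadmin). The wrapper function and the ONEHOST override
variable are only there for illustration and dry runs; the actual command is
just "onehost sync --force":

```shell
# Command to run; set ONEHOST=echo for a dry run that only prints the arguments.
ONEHOST="${ONEHOST:-onehost}"

# Re-push the remotes to the hosts even if the probe version is unchanged.
force_sync_remotes() {
    "$ONEHOST" sync --force
}
```

After it finishes, /var/tmp/one on the hosts should be populated again, and
the next monitoring cycle should run the probes normally.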

> Is it possible for the remotes to add an integrity check?
> They should not be reporting the host is "on" if the datastores
> directory of remotes has been totally deleted.

The automatic redeployment of probes is our way to deal with this. Let's
see if we can find out why it's not working for you.


> Also is it possible to configure the directory other than the
> default /var/tmp/one?


Yes, with SCRIPTS_REMOTE_DIR in /etc/one/oned.conf (line 71 in our copy):

/etc/one/oned.conf:71:SCRIPTS_REMOTE_DIR=/var/tmp/one
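
To check which directory a given front-end is actually configured with, you
can read the value back from the file. A minimal sketch: the function name
is hypothetical, and it assumes the setting sits on a single uncommented
line, as in the line above:

```shell
# Print the value of SCRIPTS_REMOTE_DIR from an oned.conf-style file.
# Commented-out lines are ignored; surrounding double quotes are stripped.
remote_scripts_dir() {
    conf="${1:-/etc/one/oned.conf}"
    sed -n 's/^[[:space:]]*SCRIPTS_REMOTE_DIR[[:space:]]*=[[:space:]]*//p' "$conf" | tr -d '"'
}
```

If you point it somewhere outside /var/tmp, make sure tmpwatch or any similar
cleanup job does not cover the new location either.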


> Finally is it possible to
> redistribute the remotes short of a onehost delete / onehost create?
>

Yes, with the --force flag of onehost sync.

Cheers,
Jaime


-- 
Jaime Melis
Project Engineer
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | jmelis at opennebula.org