Linux Admin Reports SSH Not working due to resolv.conf

I work at Walgreens corporate. I do environment management and recently built a DNS server for use on a particular project.
It’s been in place and working for a couple of months.
Recently an entry was added to resolv.conf on an application server (running SUSE 10.2) so that it points first to my new DNS server (SUSE 11.3) for reverse lookups via nslookup.
After that change, an admin began to complain that ssh logins to that app server no longer work, and says it’s due to the entry for my DNS server being there.
I need some help figuring out how to prove that is or is not the case.
What I have done is shut down DNS on my server, and I personally still cannot log in to the app server in question. I get the login prompt fine, so it’s not a firewall issue, but it’s not accepting my password (reset yesterday for all Unix/Linux servers I have access to) with any ID I have.
So, again, I’m not at all convinced that either my DNS server or the resolv.conf entry for it on the app server in question is in any way related to the ssh problem.
Any ideas?

In the first instance, have you enabled debug output when trying your
connection with the -v option? More v’s give more debug output, e.g.:

ssh -vv username@host

Cheers Malcolm °¿° LFCS, SUSE Knowledge Partner (Linux Counter #276890)
openSUSE 13.1 (Bottle) (x86_64) GNOME 3.10.1 Kernel 3.11.10-21-desktop
If you find this post helpful and are logged into the web interface,
please show your appreciation and click on the star below… Thanks!

Also, how long have you waited? The reverse-lookup problems I have seen
cause SSH trouble have not been permanent failures; instead they have been
long delays as the server tries to do a lookup of the client, which can
time out after many seconds (it has felt like thirty in the past). I
suppose there could be a way to get a permanent failure, but along with
Malcolm’s -vv option (I usually use -vvv for the extra level of detail
when troubleshooting), also check /var/log/messages on the server side to
see if anything notable shows up there.
Another thing to try is adding a line for the client’s IP address
directly to the server’s /etc/hosts file to see if that makes the lookup
really fast. I do this at my current job because a server here doesn’t
find my home IP quickly enough for me, and it made an immediate difference
in the time to log in.
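
To put a number on that delay, something like the following could be run on
the app server. It is only a rough sketch: time_reverse_lookup is a made-up
helper name, and 192.0.2.10 in the usage comment is a placeholder for the
real client address. It times a reverse lookup through the system resolver
with getent, which is not necessarily the exact code path sshd takes.

```shell
# Hypothetical helper: time one reverse lookup through the system resolver.
# Prints the elapsed time in whole seconds.
# getent consults /etc/hosts and DNS in the order set in /etc/nsswitch.conf.
time_reverse_lookup() {
    start=$(date +%s)
    # A failed lookup is fine here; only the time it took matters.
    getent hosts "$1" > /dev/null 2>&1 || true
    end=$(date +%s)
    echo $((end - start))
}

# Example (placeholder address, replace with the ssh client's IP):
#   time_reverse_lookup 192.0.2.10
```

A result of many seconds would point at a slow or unreachable nameserver; a
result of 0 suggests the reverse lookup is not what is holding things up.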

Good luck.

As for how to “prove” that the DNS entry in resolv.conf is or isn’t causing the problem, simple trial and error should confirm that. Remove the entry. Does the problem go away? Put it back. Does it return?
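
A minimal sketch of that trial, assuming the entry is an ordinary "nameserver" line: disable_nameserver is a hypothetical helper, and 192.0.2.53 in the usage comment stands in for the new DNS server's real address. It comments the line out rather than deleting it, and keeps a backup so the entry is easy to put back.

```shell
# Hypothetical helper: comment out one nameserver line in a resolv.conf-style file.
# Usage: disable_nameserver FILE IP
# A copy of the original is kept as FILE.bak.
disable_nameserver() {
    file=$1
    ip=$2
    # Escape the dots so the address is matched literally, not as a regex.
    pattern=$(printf '%s' "$ip" | sed 's/\./\\./g')
    cp -p "$file" "$file.bak" &&
    sed "s/^[[:space:]]*nameserver[[:space:]][[:space:]]*${pattern}[[:space:]]*\$/# &/" \
        "$file.bak" > "$file"
}

# Example (as root, on the app server; the address is a placeholder):
#   disable_nameserver /etc/resolv.conf 192.0.2.53
#   ... retry the ssh login ...
#   cp -p /etc/resolv.conf.bak /etc/resolv.conf   # put the entry back
```

If logins behave the same with the line commented out, the entry is unlikely to be the cause.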

But if you want additional methods of investigating this, my suggestions would be:

  1. Tell sshd not to use DNS and see how that affects the situation. Edit /etc/ssh/sshd_config (on the sshd server in question… not the client) and set “UseDNS no”. Then restart sshd with “rcsshd restart”. Does the problem go away? If so, that’s pretty good proof. If not, consider it to be inconclusive.

  2. You could take a tcpdump trace on the sshd server machine while the client attempts to connect. Do not filter it on the ssh server/client IP addresses, because you’ll also want to capture whatever DNS traffic is going in either direction. If analysis of the trace shows DNS queries about the ssh client’s IP address going to a certain server and receiving no answer, that would prove something.

  3. You could also put the IP address and name of the ssh client machine in question into the /etc/hosts file on the sshd server. That would remove the need to make a DNS query for that client. If you do that and ssh connections from that client start working, this is also good proof that the DNS side of this is having trouble.
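
The three suggestions above could be sketched roughly as follows. All three function names are made up for illustration, the addresses and the capture path in the comments are placeholders, and the tcpdump filter assumes sshd listens on the default port 22. The first helper only looks for a top-level UseDNS line and ignores Match blocks.

```shell
# 1. Report the UseDNS value found in an sshd_config file ("unset" if absent).
#    sshd keywords are case-insensitive and the first value found is used.
usedns_setting() {
    awk 'tolower($1) == "usedns" { print tolower($2); found = 1; exit }
         END { if (!found) print "unset" }' "$1"
}

# 2. Capture SSH and DNS traffic together (run as root, stop with Ctrl-C).
#    Filtering on ports rather than on the client/server addresses keeps
#    the DNS queries in the trace. The default output path is only an example.
capture_ssh_and_dns() {
    tcpdump -n -i any -w "${1:-/tmp/ssh-dns.pcap}" 'port 22 or port 53'
}

# 3. Add "IP name" to a hosts file unless that IP already has an entry.
add_hosts_entry() {
    file=$1
    ip=$2
    name=$3
    awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$file" 2>/dev/null ||
        printf '%s\t%s\n' "$ip" "$name" >> "$file"
}

# Examples (as root, on the sshd server; values are placeholders):
#   usedns_setting /etc/ssh/sshd_config
#   capture_ssh_and_dns /tmp/ssh-dns.pcap
#   add_hosts_entry /etc/hosts 192.0.2.10 client.example.com
```

The capture file can then be read back with "tcpdump -n -r" on the same path to look for DNS queries about the client's address that get no reply.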