[klibc] nfsmount default timeo=7 causes timeouts on 100 Mbps
alkisg at gmail.com
Sun Sep 15 00:51:07 PDT 2019
I think I got it.
Both nfsmount and `mount -t nfs` now default to rsize/wsize = 1 MB.
Lowering this to 32K makes all the issues go away, even with the default
timeo=7, and nfsroot=xxx client responsiveness is a whole lot better.
I think when nfsmount was initially written, the default rsize/wsize
were much lower, which matched the timeo=7.
Now the 1 MB defaults cause the lags/timeouts that I reported.
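A back-of-the-envelope sketch of that mismatch, not a measurement: it estimates how long a single rsize-sized read reply occupies a 100 Mbps link and compares that with the 700 ms that timeo=7 allows. The helper name rpc_wire_ms is made up for illustration, protocol overhead is ignored, and the idea that several reads are queued on the link at the same time is an assumption that was not verified here.

```shell
#!/bin/sh
# Rough estimate of the wire time of one READ reply, ignoring all
# protocol overhead (assumption).
# Usage: rpc_wire_ms <rsize_bytes> <link_mbps>   (hypothetical helper)
rpc_wire_ms() {
    # bytes * 8 bits, divided by (Mbps * 1000) bits per millisecond
    echo $(( $1 * 8 / ($2 * 1000) ))
}

# timeo is given in tenths of a second, so timeo=7 means 700 ms.
timeo_ms=$(( 7 * 100 ))

big=$(rpc_wire_ms 1048576 100)    # 1 MB rsize on 100 Mbps
small=$(rpc_wire_ms 32768 100)    # 32K rsize on 100 Mbps

echo "1 MB read: about ${big} ms on the wire"
echo "32K read:  about ${small} ms on the wire"
echo "1 MB replies that fit into timeo=7: $(( timeo_ms / big ))"
```

If that estimate is roughly right, a handful of 1 MB reads waiting behind each other would already use up the whole 700 ms, while 32K reads leave plenty of room.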
So please, instead of increasing timeo, decrease the default rsize/wsize
(or tell me where that 1 MB default comes from, so that I can report it there).
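For anyone who wants to repeat the test, here is a minimal sketch of the workaround using the server address and paths from the reproduction steps quoted below. The helper nfsmount_cmd is hypothetical and only prints the command so it can be inspected before running it as root; 32768 is the 32K value mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print the nfsmount command line for a given
# rsize/wsize (in bytes), keeping the default timeo=7.
nfsmount_cmd() {
    size=$1
    echo "/usr/lib/klibc/nfsmount -o timeo=7,rsize=${size},wsize=${size} 192.168.1.112:/srv /mnt"
}

# Print the 32K variant; run the printed line as root to mount.
nfsmount_cmd 32768
```

After mounting, `grep ' /mnt ' /proc/mounts` should show the rsize/wsize values that are actually in effect.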
On 9/15/19 9:48 AM, Alkis Georgopoulos wrote:
> I can't explain why 700 msecs aren't enough to avoid timeouts in 100
> Mbps networks, but my tests verify it, so I'm writing to the list to
> request that you increase the default timeo to at least 30, or to 600
> which is the default for `mount -t nfs`.
> How to reproduce:
> 1) Cabling:
> server <=> 100 Mbps switch <=> client
> Alternatively, one can use a 1000 Mbps switch and this command:
> ethtool -s enp3s0 speed 100 duplex full autoneg on
> 2) Server:
> apt install nfs-kernel-server
> echo '/srv *(ro,async,no_subtree_check)' >> /etc/exports
> exportfs -ra
> truncate -s 10G /srv/10G.file
> The sparse file ensures that disk IO bandwidth isn't an issue.
> 3) Client:
> /usr/lib/klibc/nfsmount -o timeo=7 192.168.1.112:/srv /mnt
> dd if=/mnt/10G.file of=/dev/null status=progress
> 4) Result:
> dd starts at 11.2 MB/sec, which is fine/expected,
> and slowly drops to 2 MB/sec after a while.
> It lags, skipping some seconds in its status line,
> e.g. 507510784 bytes (508 MB, 484 MiB) copied, 186 s, 2,7 MB/s^C,
> at which point Ctrl+C needs 30+ seconds to stop dd
> because of IO waiting etc.
> In another terminal tab, `dmesg -w` is full of these:
> [ 316.404250] nfs: server 192.168.1.112 not responding, still trying
> [ 316.759512] nfs: server 192.168.1.112 OK
> By using the NFS mount command defaults, timeo=600 and retrans=2, dd is
> constantly at 11.2 MB/sec, Ctrl+C is instant, and there's nothing in dmesg.
> It is entirely possible that timeo=7 should be enough and I bumped into
> an NFS bug, but I'm not experienced enough to troubleshoot it more
> without help.
> If anyone can make timeo=7 work properly in 100 Mbps networks in any
> distribution/version, please tell me to test with that.
> I was testing with Ubuntu 18.04.3, kernel 4.15.
> Kind regards,
> Alkis Georgopoulos