[klibc] nfsmount default timeo=7 causes timeouts on 100 Mbps
alkisg at gmail.com
Thu Sep 19 21:58:08 PDT 2019
In case anyone's interested, I followed up on the linux-nfs mailing list:
On 9/15/19 10:51 AM, Alkis Georgopoulos wrote:
> I think I got it.
> Both nfsmount and `mount -t nfs` now default to rsize/wsize = 1 MB.
> By lowering this to 32K, all issues are gone, even with the default
> timeo=7. And nfsroot=xxx client responsiveness is a whole lot better.
> I think when nfsmount was initially written, the default rsize/wsize
> were much lower, which matched the timeo=7.
> Now they cause the lags/timeouts that I reported.
> So please, instead of increasing timeo, decrease the default rsize/wsize
> to 32K.
> (or, where does that 1 MB default come from, so that I report it there...)
> On 9/15/19 9:48 AM, Alkis Georgopoulos wrote:
>> I can't explain why 700 msecs aren't enough to avoid timeouts in 100
>> Mbps networks, but my tests verify it, so I'm writing to the list to
>> request that you increase the default timeo to at least 30, or to 600
>> which is the default for `mount -t nfs`.
>> How to reproduce:
>> 1) Cabling:
>> server <=> 100 Mbps switch <=> client
>> Alternatively, one can use a 1000 Mbps switch and this command:
>> ethtool -s enp3s0 speed 100 duplex full autoneg on
>> 2) Server:
>> apt install nfs-kernel-server
>> echo '/srv *(ro,async,no_subtree_check)' >> /etc/exports
>> exportfs -ra
>> truncate -s 10G /srv/10G.file
>> The sparse file ensures that disk IO bandwidth isn't an issue.
>> 3) Client:
>> /usr/lib/klibc/nfsmount -o timeo=7 192.168.1.112:/srv /mnt
>> dd if=/mnt/10G.file of=/dev/null status=progress
>> 4) Result:
>> dd starts at 11.2 MB/sec, which is fine/expected,
>> then slowly drops to 2 MB/sec after a while.
>> It also lags, skipping some seconds in its output line,
>> e.g. 507510784 bytes (508 MB, 484 MiB) copied, 186 s, 2,7 MB/s^C,
>> at which point Ctrl+C needs 30+ seconds to stop dd,
>> because of IO waiting etc.
>> In another terminal tab, `dmesg -w` is full of these:
>> [ 316.404250] nfs: server 192.168.1.112 not responding, still trying
>> [ 316.759512] nfs: server 192.168.1.112 OK
>> By using the NFS mount command defaults, timeo=600 and retrans=2, dd
>> is constantly at 11.2 MB/sec, Ctrl+C is instant, and there's nothing
>> in dmesg.
>> It is entirely possible that timeo=7 should be enough and I bumped
>> into an NFS bug, but I'm not experienced enough to troubleshoot it
>> more without help.
>> If anyone can make timeo=7 work properly in 100 Mbps networks in any
>> distribution/version, please tell me to test with that.
>> I was testing with Ubuntu 18.04.3, kernel 4.15.
>> Kind regards,
>> Alkis Georgopoulos
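
A rough back-of-the-envelope sketch of the numbers discussed above, in case it helps. It only computes how long one READ reply of a given rsize occupies a 100 Mbps link, and how many such replies fit back to back into the timeo window. The function names are made up for this example, and protocol overhead, latency and server time are ignored. Whether queued large reads are really what trips timeo=7 is an assumption here, not something the thread confirms.

```sh
#!/bin/sh
# Hypothetical helpers for illustration only; not part of klibc or nfs-utils.

# Wire time in microseconds for one reply of `rsize` bytes on a link of
# `mbps` megabits per second: bits / (Mbit/s) gives microseconds.
wire_us() {
    rsize=$1
    mbps=$2
    echo $(( rsize * 8 / mbps ))
}

# Number of back-to-back replies that fit into the timeout window.
# `timeo` is in tenths of a second, as in the timeo= mount option.
replies_per_timeo() {
    rsize=$1
    mbps=$2
    timeo=$3
    per=$(wire_us "$rsize" "$mbps")
    [ "$per" -gt 0 ] || per=1    # guard against division by zero
    echo $(( timeo * 100000 / per ))
}

echo "1 MiB  at 100 Mbps: $(wire_us 1048576 100) us per reply"
echo "32 KiB at 100 Mbps: $(wire_us 32768 100) us per reply"
echo "1 MiB  replies within timeo=7: $(replies_per_timeo 1048576 100 7)"
echo "32 KiB replies within timeo=7: $(replies_per_timeo 32768 100 7)"
```

With these simplifications a 1 MiB reply takes roughly 84 ms on the wire against under 3 ms for 32 KiB, so only a handful of 1 MiB replies fit inside 700 ms. That would be consistent with the observation that lowering rsize/wsize to 32K makes the timeouts go away, but it is not a diagnosis.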