[klibc] nfsmount default timeo=7 causes timeouts on 100 Mbps

Alkis Georgopoulos alkisg at gmail.com
Thu Sep 19 21:58:08 PDT 2019


In case anyone's interested, I followed up on the linux-nfs mailing list:
https://marc.info/?l=linux-nfs&m=156887818618861&w=2

Thanks,
Alkis


On 9/15/19 10:51 AM, Alkis Georgopoulos wrote:
> I think I got it.
> 
> Both nfsmount and `mount -t nfs` now default to rsize/wsize = 1 MB.
> By lowering this to 32K, all issues are gone, even with the default 
> timeo=7. And nfsroot=xxx client responsiveness is a whole lot better.
> 
> I think when nfsmount was initially written, the default rsize/wsize 
> were much lower, which matched the timeo=7.
> 
> Now they cause the lags/timeouts that I reported.
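
A rough way to see why 1 MB reads collide with timeo=7: timeo counts tenths of a second, so the client allows 700 ms per request, while a single 1 MiB reply needs about 84 ms of wire time at 100 Mbps. The sketch below only does that arithmetic. The helper names wire_us and replies_per_timeo are made up for illustration, protocol overhead is ignored, and it assumes that queued replies share the link one after another.

```shell
#!/bin/sh
# Microseconds needed to put one reply of SIZE_BYTES on a LINK_MBPS link.
# bits / (Mbit/s) gives microseconds directly; header overhead is ignored.
# Usage: wire_us SIZE_BYTES LINK_MBPS
wire_us() {
    echo $(( $1 * 8 / $2 ))
}

# How many such replies fit back to back into one timeo interval.
# timeo is given in tenths of a second, i.e. 100000 microseconds each.
# Usage: replies_per_timeo SIZE_BYTES LINK_MBPS TIMEO
replies_per_timeo() {
    echo $(( $3 * 100000 / $(wire_us "$1" "$2") ))
}

wire_us 1048576 100               # rsize = 1 MiB
wire_us 32768 100                 # rsize = 32 KiB
replies_per_timeo 1048576 100 7   # timeo=7, 1 MiB replies
replies_per_timeo 32768 100 7     # timeo=7, 32 KiB replies
```

Under those assumptions only about 8 replies of 1 MiB fit into one timeout window, against roughly 267 replies of 32 KiB, so a handful of queued or read-ahead requests could already be enough to trip the timer with the larger size.
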
> 
> So please, instead of increasing timeo, decrease the default rsize/wsize 
> to 32K.
> (Or, if that 1 MB default comes from somewhere else, where is that, so 
> that I can report it there?)
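
Until the default changes, the smaller sizes can presumably be requested per mount. A minimal sketch, assuming nfsmount accepts rsize= and wsize= in its -o list the same way it accepts timeo=; nfs_opts is a hypothetical helper, and the server address and paths are the ones from the test setup quoted below.

```shell
#!/bin/sh
# Build an -o option string with identical read and write sizes.
# Usage: nfs_opts TIMEO SIZE_BYTES
nfs_opts() {
    echo "timeo=$1,rsize=$2,wsize=$2"
}

# Print the option string for timeo=7 and 32 KiB transfers.
nfs_opts 7 32768

# Intended use (needs root and a reachable server, so left commented out):
#   /usr/lib/klibc/nfsmount -o "$(nfs_opts 7 32768)" 192.168.1.112:/srv /mnt
#   grep ' /mnt ' /proc/mounts    # shows the rsize/wsize actually in effect
```

Checking /proc/mounts afterwards is worthwhile because the server may negotiate a size different from the one requested.
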
> 
> Thanks,
> Alkis
> 
> On 9/15/19 9:48 AM, Alkis Georgopoulos wrote:
>> I can't explain why 700 ms (timeo=7) isn't enough to avoid timeouts on 
>> 100 Mbps networks, but my tests confirm it, so I'm writing to the list 
>> to request that you increase the default timeo to at least 30, or to 
>> 600, which is the default for `mount -t nfs`.
>>
>> How to reproduce:
>>
>> 1) Cabling:
>> server <=> 100 Mbps switch <=> client
>>
>> Alternatively, one can use a 1000 Mbps switch and this command:
>> ethtool -s enp3s0 speed 100 duplex full autoneg on
>>
>> 2) Server:
>> apt install nfs-kernel-server
>> echo '/srv *(ro,async,no_subtree_check)' >> /etc/exports
>> exportfs -ra
>> truncate -s 10G /srv/10G.file
>> The sparse file ensures that disk IO bandwidth isn't an issue.
>>
>> 3) Client:
>> /usr/lib/klibc/nfsmount -o timeo=7 192.168.1.112:/srv /mnt
>> dd if=/mnt/10G.file of=/dev/null status=progress
>>
>> 4) Result:
>> dd starts at 11.2 MB/sec, which is fine/expected, and slowly drops
>> to 2 MB/sec after a while. It also lags, skipping some seconds in its
>> progress line, e.g.:
>> 507510784 bytes (508 MB, 484 MiB) copied, 186 s, 2,7 MB/s^C
>> At that point Ctrl+C needs 30+ seconds to stop dd,
>> because of IO waiting etc.
>>
>> In another terminal tab, `dmesg -w` is full of these:
>> [  316.404250] nfs: server 192.168.1.112 not responding, still trying
>> [  316.759512] nfs: server 192.168.1.112 OK
>>
>> With the `mount -t nfs` defaults, timeo=600 and retrans=2, dd stays 
>> at a constant 11.2 MB/sec, Ctrl+C is instant, and there's nothing 
>> in dmesg.
>>
>> It is entirely possible that timeo=7 should be enough and I bumped 
>> into an NFS bug, but I'm not experienced enough to troubleshoot it 
>> more without help.
>>
>> If anyone can make timeo=7 work properly on 100 Mbps networks in any 
>> distribution/version, please tell me and I'll test with that.
>> I was testing with Ubuntu 18.04.3, kernel 4.15.
>>
>> Kind regards,
>> Alkis Georgopoulos
> 


