Troubleshooting large file transfers across wide area networks

From Public wiki of Kevin P. Inscoe

Latest revision as of 01:17, 30 January 2018

Flow control troubleshooting

Solaris

http://www.princeton.edu/~unix/Solaris/troubleshoot/netstat.html

 # netstat -i
 Name       Mtu  Net/Dest               Address                Ipkts      Ierrs Opkts      Oerrs Collis Queue
 lo0        8232 localhost              localhost              11982693   0     11982693   0     0      0
 nxge1      1500 wdcdvuv10-adm.hmco.com wdcdvuv10-adm.hmco.com 129186999  0     258305840  0     0      0
 nxge2      1500 wdcdvuv10-35.hmco.com  wdcdvuv10-35.hmco.com  1116133305 0     1099063045 0     0      0
 nxge130000 1500 wdcdvuv10-130          wdcdvuv10-130          247847548  0     180990116  0     0      0
 nxge132000 1500 wdcdvuv10.hmco.com     wdcdvuv10.hmco.com     247847548  0     180990116  0     0      0

Check only the following columns: Ierrs (input errors), Oerrs (output errors), Collis (collisions), and Queue (queue length). Run this on wdcdvuv10 (the global zone) to check the interfaces for the zones.

 # netstat -i | grep nxge2 | awk '{ print $6 " " $8 " " $9 " " $10 }'
 0 0 0 0

All four values should be zero if the interface is working well.
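The same check can be applied to every interface at once. The following is a minimal sketch, assuming the ten-column "netstat -i" layout shown above; the function name check_if_errors is illustrative and not part of the original notes.

```shell
# Sketch: report interfaces whose error, collision, or queue counters are non-zero.
# Assumes the ten-column "netstat -i" layout:
#   Name Mtu Net/Dest Address Ipkts Ierrs Opkts Oerrs Collis Queue
# check_if_errors is a hypothetical helper name.
check_if_errors() {
    # Reads "netstat -i" output on stdin; header lines and short lines are skipped.
    awk '$1 != "Name" && NF >= 10 && ($6 + $8 + $9 + $10) > 0 {
        printf "%s: Ierrs=%s Oerrs=%s Collis=%s Queue=%s\n", $1, $6, $8, $9, $10
    }'
}

# Usage on a live system:
#   netstat -i | check_if_errors
```

No output means every interface passed the check.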

http://www.brendangregg.com/Perf/network.html

 # kstat -p | grep nxge2
 # kstat -p | grep "tcp:"

Flow Control monitoring

 # date; kstat -p 'tcp:0:tcpstat:tcp_flwctl_on'            
 tcp:0:tcpstat:tcp_flwctl_on     16330314
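A single reading is hard to interpret, since the value appears to be a running count; comparing two readings taken some time apart shows how much it grew during a transfer. A minimal sketch, assuming the "kstat -p" output format shown above (statistic name, whitespace, value); the helper name flwctl_delta and the 10-second interval are illustrative.

```shell
# Sketch: difference between two "kstat -p" readings of the same counter.
# flwctl_delta is a hypothetical helper; it expects two lines of the form
#   tcp:0:tcpstat:tcp_flwctl_on     16330314
flwctl_delta() {
    before=$(printf '%s\n' "$1" | awk '{ print $2 }')
    after=$(printf '%s\n' "$2" | awk '{ print $2 }')
    echo $((after - before))
}

# Usage on a live system (the 10-second interval is arbitrary):
#   a=$(kstat -p 'tcp:0:tcpstat:tcp_flwctl_on'); sleep 10
#   b=$(kstat -p 'tcp:0:tcpstat:tcp_flwctl_on')
#   flwctl_delta "$a" "$b"
```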

Improving flow control

http://www.psc.edu/networking/projects/tcptune/#Solaris

 # ndd /dev/tcp tcp_wscale_always  #(should be 1)
 # ndd /dev/tcp tcp_tstamp_if_wscale  #(should be 1)
 # ndd /dev/tcp tcp_sack_permitted  #(should be 2)

In Solaris 10, use "ndd -set /dev/tcp tcp_wscale_always 1". This should be enabled by default.
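The three checks above can be wrapped in a small helper that compares each current value with the recommended one. This is only a sketch: check_tcp_param is a hypothetical name, and it assumes the Solaris ndd command is on the PATH.

```shell
# Sketch: compare a current ndd value with a recommended value.
# check_tcp_param is a hypothetical helper; it assumes Solaris ndd is available.
check_tcp_param() {
    # $1 = parameter name, $2 = recommended value
    current=$(ndd -get /dev/tcp "$1")
    if [ "$current" = "$2" ]; then
        echo "$1 ok ($current)"
    else
        echo "$1 is $current, expected $2"
    fi
}

# Usage on a Solaris host, with the recommended values listed above:
#   check_tcp_param tcp_wscale_always 1
#   check_tcp_param tcp_tstamp_if_wscale 1
#   check_tcp_param tcp_sack_permitted 2
```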

Set/get the maximum (send or receive) TCP buffer size an application can request:

 # ndd -get /dev/tcp tcp_max_buf
 4000000
 # ndd -set /dev/tcp tcp_max_buf 4000000

Set/get the maximum congestion window:

 # ndd -get /dev/tcp tcp_cwnd_max
 4000000
 # ndd -set /dev/tcp tcp_cwnd_max 4000000

Set/get the default send and receive buffer sizes:

 # ndd -get /dev/tcp tcp_xmit_hiwat
 4000000
 # ndd -get /dev/tcp tcp_recv_hiwat
 4000000
 # ndd -set /dev/tcp tcp_xmit_hiwat 4000000
 # ndd -set /dev/tcp tcp_recv_hiwat 4000000

Update: 2010-11-10 7:45am ET:

I have temporarily raised them to 1 GB (1000000000 bytes) to see whether this helps:

 inscoek@wdcdvuw15:/home/inscoek> ./get_buffers.sh 
 Current TCP buffer settings on abcdvuw15 at Wednesday, November 10, 2010  7:48:26 AM EST
 
 tcp_max_buf:
 1000000000
 tcp_cwnd_max:
 1000000000
 tcp_xmit_hiwat:
 1000000000
 tcp_recv_hiwat
 1000000000
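The get_buffers.sh script itself is not included in these notes. The following is a hypothetical sketch of what such a script could look like, inferred only from the output above; the function name print_tcp_buffers is an assumption, and the date format will differ from the original depending on locale.

```shell
# Hypothetical sketch of a get_buffers.sh-style script. The original script
# is not shown in these notes, so everything here is an assumption.
print_tcp_buffers() {
    echo "Current TCP buffer settings on $(hostname) at $(date)"
    echo
    # Print each buffer-related tunable followed by its current value.
    for param in tcp_max_buf tcp_cwnd_max tcp_xmit_hiwat tcp_recv_hiwat; do
        echo "${param}:"
        ndd -get /dev/tcp "$param"
    done
}

# Usage on a Solaris host:
#   print_tcp_buffers
```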

http://unix.derkeiler.com/Newsgroups/comp.unix.solaris/2006-05/msg01029.html

on the 6800...

 # route change <route-to-15K> -sendpipe 403456
 # route change <route-to-15K> -recvpipe 403456
 # route get <route-to-15K>

on the 15K...

 # route change <route-to-6800> -sendpipe 403456
 # route change <route-to-6800> -recvpipe 403456
 # route get <route-to-6800>

Calculate the window more or less as:

 (44*1024*1024*.070)/8 = 403456
  ------------  ---  -   ------
       |         |   |     |
       |         |   |     ------- window size in bytes
       |         |   ------------- bits/byte
       |         ----------------- delay (rtt)
       --------------------------- ~bandwidth of ds3 (bits/sec)

(I don't know whether bandwidth should be expressed in powers of 2 or 10, but 44 Mb/s is the right ballpark for a DS3. Perhaps someone else can clarify.)
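The arithmetic above is the bandwidth-delay product. A minimal sketch of the calculation follows, assuming a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk); the names bdp_bytes and round_down_1k are illustrative. With the figures from the post the exact result is 403701 bytes, and 403456 matches that value rounded down to a multiple of 1024 (394 * 1024). On the powers-of-2-or-10 question, link rates are normally quoted in decimal units, and the nominal DS3 rate is 44.736 Mb/s, which gives a slightly smaller window.

```shell
# Sketch: bandwidth-delay product in bytes.
# bdp_bytes is a hypothetical helper: $1 = bandwidth in bits/sec, $2 = RTT in ms.
bdp_bytes() {
    awk -v bw="$1" -v rtt_ms="$2" 'BEGIN { printf "%d\n", bw * rtt_ms / 8000 }'
}

# Round a byte count down to a multiple of 1024.
round_down_1k() {
    echo $(( $1 / 1024 * 1024 ))
}

# Figures from the post: 44*1024*1024 bits/sec, 70 ms RTT.
bdp_bytes 46137344 70                      # prints 403701
round_down_1k "$(bdp_bytes 46137344 70)"   # prints 403456
# Nominal DS3 rate in decimal units (44.736 Mb/s), same RTT.
bdp_bytes 44736000 70                      # prints 391440
```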

Check the values of these parameters using ndd:

 /dev/tcp tcp_max_buf
 /dev/tcp tcp_wscale_always
 /dev/tcp tcp_tstamp_if_wscale
 /dev/tcp tcp_deferred_acks_max
 /dev/tcp tcp_compression_enabled

Window scaling is negotiated only in the 3-way handshake, so far as I know. The window size may change, but the scale factor should not once the connection is established. You might try increasing tcp_deferred_acks_max in case exceeding it causes the window to shrink. Also, I am just a sysadmin. Good luck.
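For context on why window scaling matters here: without the option, the advertised TCP window is limited to 65535 bytes, and each step of the scale factor (at most 14) doubles that limit. A minimal sketch of finding the smallest scale factor that covers a given window; min_wscale is an illustrative name, not a Solaris tool.

```shell
# Sketch: smallest TCP window scale factor (0..14) whose limit,
# 65535 * 2^factor, covers the requested window size in bytes.
# min_wscale is a hypothetical helper: $1 = desired window in bytes.
min_wscale() {
    shift_count=0
    limit=65535
    while [ "$limit" -lt "$1" ] && [ "$shift_count" -lt 14 ]; do
        limit=$((limit * 2))
        shift_count=$((shift_count + 1))
    done
    echo "$shift_count"
}

min_wscale 403456     # prints 3
min_wscale 4000000    # prints 6
```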

Tools

TCP window calculator - http://www.kehlet.cx/docs/tcpwin.php

Further reading

http://www.solarisinternals.com/wiki/index.php/Networks

http://sunaytripathi.wordpress.com/2010/03/25/solaris-10-networking-the-magic-revealed/

File:Tcp-wan-perf.pdf

http://www.psc.edu/networking/projects/tcptune/

http://fasterdata.es.net/TCP-tuning//

http://onlamp.com/pub/a/onlamp/2005/11/17/tcp_tuning.html

http://fasterdata.es.net/TCP-tuning//troubleshooting.html

https://plone3.fnal.gov/P0/WAN/netperf/methodology/

File:TCP-Tuning-Tutorial.pdf

http://fasterdata.es.net/TCP-tuning//tools.html

http://docs.sun.com/app/docs/doc/817-0404/appendixa-28?a=view

http://docs.sun.com/app/docs/doc/817-0404/chapter4-1?l=en&a=view

http://www.sean.de/Solaris/soltune.html

http://www.sean.de/Solaris/soltune.html#water

http://shlang.com/writing/tcp-perf.html

http://fasterdata.es.net/TCP-tuning/

http://blogs.sun.com/stw/entry/solaris_zones_and_networking_common

http://www.sun.com/bigadmin/content/networkperf/index.jsp

http://www.brandonhutchinson.com/Solaris_NIC_speed_and_duplex_settings.html

http://hub.opensolaris.org/bin/view/Project+nemo/WebHome

http://www.sun.com/bigadmin/sundocs/articles/nicddladmconf.jsp

http://books.google.com/books?id=L4oyNrsFBbsC&lpg=PT262&ots=jl2ki1e4zp&dq=solaris%20window%20scaling&pg=PT262#v=onepage&q=solaris%20window%20scaling&f=false

http://www.kehlet.cx/articles/99.html

http://web.archive.org/web/20080803082218/http://dast.nlanr.net/Guides/GettingStarted/TCP_window_size.html

http://www.cisco.com/en/US/tech/tk870/tk877/tk880/technologies_tech_note09186a008011a218.shtml

http://www.informit.com/articles/article.aspx?p=174099&seqNum=3

http://sunsite.uakom.sk/sunworldonline/swol-12-1996/swol-12-perf.html