Optimal file server settings?

Status
Not open for further replies.

brez

Member
5
2012
0
0
Hi guys

I've recently acquired a new file server. It has 8GB RAM, a quad-core Xeon and 21x2TB SATA2 drives in RAID10, on 2Gbit unmetered.

I feel a little ripped off as it really cannot cope with the traffic

I'm running nginx to serve the static files (from 1MB to 6GB) with 10 workers and linux-aio enabled

At busy times the load averages are sky high and it really struggles. If you start writing to the disks, all hell breaks loose and it hangs until the write is complete (I need writes because it handles uploads too, though not very many, say 10-20 a day tops).

It is not a bandwidth issue, as I can download off the OS disks at full speed. I'm at my wits' end with this one, as it has cost me an absolute fortune.

Thanks in advance for any help and I hope to help the community with what I know in the future :)
 
3 comments
Looks like a disk I/O issue. A few questions:

1) Are all drives in a single RAID10?
2) Which hardware RAID controller is used?
3) Which caching method is set on the controller: write-back or write-through?
4) What stripe size is used?
5) Do you have millions of small files on it?
6) How many nginx connections are there at peak time?
7) What is the processor model number?
8) Could you please post your nginx config and a screenshot of the top command?
 
Thanks for your reply

1) All the drives for the files are in hardware RAID10. It is a Supermicro server with its own disks for the operating system etc.; the 24x2TB drives are separate (it's one of those Leaseweb high-storage Supermicro servers)
2) I'm not sure
3) Again, clueless
4) It was all done by the guys at the datacenter. How do I check?
5) No, around 50,000 small to large files
6) It can go up to around 1500, maybe 2000+ on a Sunday evening
7) Intel(R) Xeon(R) CPU L5410 @ 2.33GHz, 4 cores (Not great I know)
8)
The main bits (Rest is just the http secure link configs etc)
Code:
worker_processes  14;


events {
    worker_connections  10000;
}


sendfile off;
aio on;
output_buffers 1 2m;
keepalive_timeout 15;
send_timeout 30s;
tcp_nopush on;
tcp_nodelay on;
gzip on;
client_body_temp_path /files/temp 1 2;
client_max_body_size 102400m;
reset_timedout_connection on;
server_names_hash_bucket_size 512;
Code:
top - 14:28:16 up 24 days, 21:08,  1 user,  load average: 8.60, 8.58, 8.31
Tasks: 172 total,   1 running, 170 sleeping,   0 stopped,   1 zombie
Cpu(s):  0.2%us,  9.1%sy,  0.0%ni, 37.8%id, 47.4%wa,  0.0%hi,  5.4%si,  0.0%st
Mem:   8059092k total,  7889520k used,   169572k free,     5772k buffers
Swap:  4194288k total,        0k used,  4194288k free,  4611136k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    4 root      20   0     0    0    0 S 12.6  0.0   3050:14 ksoftirqd/0
    9 root      20   0     0    0    0 S  6.6  0.0   1700:37 ksoftirqd/1
17956 nobody    20   0  191m 159m  644 S  4.0  2.0 277:32.50 nginx
17959 nobody    20   0  258m 226m  636 D  4.0  2.9 277:37.36 nginx
17963 nobody    20   0  193m 161m  640 S  4.0  2.0 275:23.69 nginx
17958 nobody    20   0  209m 177m  640 D  3.7  2.3 277:53.59 nginx
17960 nobody    20   0  254m 222m  636 S  3.7  2.8 278:06.80 nginx
   58 root      20   0     0    0    0 S  3.3  0.0 558:27.60 kswapd0
17957 nobody    20   0  223m 191m  644 D  3.3  2.4 277:42.72 nginx
17961 nobody    20   0  223m 190m  640 S  3.3  2.4 277:48.09 nginx
17964 nobody    20   0  257m 224m  636 S  3.3  2.9 276:32.00 nginx
17965 nobody    20   0  283m 251m  636 D  3.3  3.2 277:54.33 nginx
17955 nobody    20   0  233m 201m  636 D  3.0  2.6 278:29.51 nginx
17962 nobody    20   0  240m 208m  640 S  3.0  2.6 278:03.51 nginx
17968 nobody    20   0  201m 168m  640 D  3.0  2.1 279:19.58 nginx
17967 nobody    20   0  199m 167m  640 S  2.3  2.1 274:08.11 nginx
17966 nobody    20   0  233m 200m  640 S  2.0  2.6 277:36.16 nginx
iotop

Code:
Total DISK READ: 141.82 M/s | Total DISK WRITE: 0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
17956 be/4 nobody     16.02 M/s    0.00 B/s  0.00 % 75.19 % nginx: wo~er process
17968 be/4 nobody     16.02 M/s    0.00 B/s  0.00 % 61.14 % nginx: wo~er process
17958 be/4 nobody     12.72 M/s    0.00 B/s  0.00 % 54.23 % nginx: wo~er process
17964 be/4 nobody     14.72 M/s    0.00 B/s  0.00 % 51.75 % nginx: wo~er process
17955 be/4 nobody     12.40 M/s    0.00 B/s  0.00 % 50.19 % nginx: wo~er process
17957 be/4 nobody     11.25 M/s    0.00 B/s  0.00 % 39.42 % nginx: wo~er process
17963 be/4 nobody     12.72 M/s    0.00 B/s  0.00 % 37.81 % nginx: wo~er process
17960 be/4 nobody      9.78 M/s    0.00 B/s  0.00 % 37.67 % nginx: wo~er process
17966 be/4 nobody      8.25 M/s    3.77 K/s  0.00 % 33.96 % nginx: wo~er process
17962 be/4 nobody      7.42 M/s    0.00 B/s  0.00 % 30.26 % nginx: wo~er process
17961 be/4 nobody      7.30 M/s    0.00 B/s  0.00 % 27.47 % nginx: wo~er process
17959 be/4 nobody      3.59 M/s    0.00 B/s  0.00 % 24.07 % nginx: wo~er process
17965 be/4 nobody      5.42 M/s    0.00 B/s  0.00 % 17.17 % nginx: wo~er process
17967 be/4 nobody      3.83 M/s    0.00 B/s  0.00 % 14.72 % nginx: wo~er process
 1603 be/4 root      403.15 K/s    0.00 B/s  0.00 %  3.96 % python /u~/bin/iotop
20756 be/4 apache      0.00 B/s    3.77 K/s  0.00 %  0.00 % httpd
    1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init
    2 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kthreadd]
    3 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/0]
    4 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/0]
    5 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/0]
    6 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [watchdog/0]
This was taken at a low-to-medium traffic time

Thanks for your help mate!
 
A few things you should understand:

1) Contact Leaseweb and ask which controller is used and whether it has a BBU with write-back enabled. A poor card will hurt disk performance.

2) Your processor is an Intel Xeon L5410 @ 2.33GHz, which is a fairly old processor. The server config looks like it was built just for storing files, not for a busy file-sharing server.

3) Check the output of free -m and make sure you are not using swap.

4) You have enabled aio. I have heard that aio support in the Linux kernel is not good. Note that on Linux, nginx only reads asynchronously with aio when directio is also enabled; otherwise the reads block, and your config does not set directio.

5) From the top output we can see that about 47% of CPU time is spent waiting on I/O.

6) Try decreasing the number of worker processes to 4. There is no point in running 14 workers on an old quad-core processor.

7) Turn off gzip in the nginx config. This will reduce CPU usage.
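
To make points 4, 6 and 7 concrete, here is a rough sketch of how the posted config might look with those changes applied. It is only a starting point, not a tested setup: the http { } wrapper is assumed (the posted snippet did not show it), and the directio threshold of 4m is an arbitrary value picked for illustration. directio bypasses the page cache for files at or above that size, so whether it helps depends on the workload; measure before and after.

```nginx
# One worker per core on the quad-core Xeon (point 6).
worker_processes  4;

events {
    worker_connections  10000;
}

http {   # assumed wrapper; not shown in the posted config
    sendfile        off;
    aio             on;
    # On Linux, aio reads are only asynchronous when directio is enabled.
    # 4m is an assumed threshold: files of this size or larger use O_DIRECT.
    directio        4m;
    output_buffers  1 2m;

    # Point 7: no on-the-fly compression.
    gzip            off;

    keepalive_timeout 15;
    send_timeout 30s;
    reset_timedout_connection on;
}
```

Run nginx -t to check the syntax before reloading, then compare the %wa figure in top and the load average against the numbers posted above.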
 