I am seeing a lot of "Too many open files" exceptions while my program runs. Typically they take the following form:
org.jboss.netty.channel.ChannelException: Failed to create a selector.
...
Caused by: java.io.IOException: Too many open files
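However, these are not the only exceptions. I have seen similar ones (also caused by "Too many open files"), but far less frequently.
Strangely, I have set the open-file limit of the screen session (from which I launch the program) to 1M: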
root@s11:~/fabiim-cbench# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) 16382
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
**open files (-n) 1000000**
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
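Moreover, judging by the output of lsof -p, I never see more than 1111 open files (sockets, pipes, files) before the exceptions are thrown.
Question: what is going wrong, and/or how can I dig deeper into this problem?
Extra: I am currently integrating Floodlight with bft-smart. In short, the Floodlight process is the one that crashes with too many open files while running a stress test launched by a benchmark program. The benchmark program maintains 64 TCP connections to the Floodlight process, which in turn should maintain at least 64 * 3 TCP connections to the bft-smart replicas. Both programs use Netty to manage these connections.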
Best answer
The first thing to check: can you run ulimit from inside your Java process, to make sure the file limit is the same there? Code like this should work:
InputStream is = Runtime.getRuntime().exec(new String[] {"bash", "-c", "ulimit -a"}).getInputStream();
int c;
while ((c = is.read()) != -1) {
System.out.write(c);
}
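As an alternative to shelling out, the JVM can report its own descriptor usage and limit. The following is a minimal sketch, assuming a HotSpot/OpenJDK JVM on a Unix-like system where com.sun.management.UnixOperatingSystemMXBean is available; the class name FdLimitCheck and the helper fdCounts are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdLimitCheck {
    /**
     * Returns {open, max} file descriptor counts as seen by this JVM,
     * or null when the platform bean does not expose them (assumption:
     * they are only exposed through the Unix-specific MXBean).
     */
    static long[] fdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = fdCounts();
        if (counts == null) {
            System.out.println("File descriptor counts are not available on this JVM");
        } else {
            System.out.println("open fds: " + counts[0] + ", max fds: " + counts[1]);
        }
    }
}
```

Logging these two numbers periodically from inside the failing process would show whether the limit the JVM actually runs under matches the one reported by the shell.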
If the limit still shows 1 million, then you are in for some hard debugging.
Here are a few things I would look into if I had to debug this:
- Are you running out of TCP port numbers? What does netstat -an show when you hit this error?
- Use strace to find out exactly which system call, with which parameters, causes this error to be thrown. EMFILE is error number 24.
- The "Too many open files" EMFILE error can actually be raised by a number of different system calls for a number of different reasons:
$ cd /usr/share/man/man2
$ zgrep -A 2 EMFILE *
accept.2.gz:.B EMFILE
accept.2.gz:The per-process limit of open file descriptors has been reached.
accept.2.gz:.TP
accept.2.gz:--
accept.2.gz:.\" EAGAIN, EBADF, ECONNABORTED, EINTR, EINVAL, EMFILE,
accept.2.gz:.\" ENFILE, ENOBUFS, ENOMEM, ENOTSOCK, EOPNOTSUPP, EPROTO, EWOULDBLOCK.
accept.2.gz:.\" In addition, SUSv2 documents EFAULT and ENOSR.
dup.2.gz:.B EMFILE
dup.2.gz:The process already has the maximum number of file
dup.2.gz:descriptors open and tried to open a new one.
epoll_create.2.gz:.B EMFILE
epoll_create.2.gz:The per-user limit on the number of epoll instances imposed by
epoll_create.2.gz:.I /proc/sys/fs/epoll/max_user_instances
eventfd.2.gz:.B EMFILE
eventfd.2.gz:The per-process limit on open file descriptors has been reached.
eventfd.2.gz:.TP
execve.2.gz:.B EMFILE
execve.2.gz:The process has the maximum number of files open.
execve.2.gz:.TP
execve.2.gz:--
execve.2.gz:.\" document ETXTBSY, EPERM, EFAULT, ELOOP, EIO, ENFILE, EMFILE, EINVAL,
execve.2.gz:.\" EISDIR or ELIBBAD error conditions.
execve.2.gz:.SH NOTES
fcntl.2.gz:.B EMFILE
fcntl.2.gz:For
fcntl.2.gz:.BR F_DUPFD ,
getrlimit.2.gz:.BR EMFILE .
getrlimit.2.gz:(Historically, this limit was named
getrlimit.2.gz:.B RLIMIT_OFILE
inotify_init.2.gz:.B EMFILE
inotify_init.2.gz:The user limit on the total number of inotify instances has been reached.
inotify_init.2.gz:.TP
mmap.2.gz:.\" SUSv2 documents additional error codes EMFILE and EOVERFLOW.
mmap.2.gz:.SH AVAILABILITY
mmap.2.gz:On POSIX systems on which
mount.2.gz:.B EMFILE
mount.2.gz:(In case no block device is required:)
mount.2.gz:Table of dummy devices is full.
open.2.gz:.B EMFILE
open.2.gz:The process already has the maximum number of files open.
open.2.gz:.TP
pipe.2.gz:.B EMFILE
pipe.2.gz:Too many file descriptors are in use by the process.
pipe.2.gz:.TP
shmop.2.gz:.\" SVr4 documents an additional error condition EMFILE.
shmop.2.gz:
shmop.2.gz:In SVID 3 (or perhaps earlier)
signalfd.2.gz:.B EMFILE
signalfd.2.gz:The per-process limit of open file descriptors has been reached.
signalfd.2.gz:.TP
socket.2.gz:.B EMFILE
socket.2.gz:Process file table overflow.
socket.2.gz:.TP
socketpair.2.gz:.B EMFILE
socketpair.2.gz:Too many descriptors are in use by this process.
socketpair.2.gz:.TP
spu_create.2.gz:.B EMFILE
spu_create.2.gz:The process has reached its maximum open files limit.
spu_create.2.gz:.TP
timerfd_create.2.gz:.B EMFILE
timerfd_create.2.gz:The per-process limit of open file descriptors has been reached.
timerfd_create.2.gz:.TP
truncate.2.gz:.\" error conditions EMFILE, EMULTIHP, ENFILE, ENOLINK. SVr4 documents for
truncate.2.gz:.\" .BR ftruncate ()
truncate.2.gz:.\" an additional EAGAIN error condition.
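Since the exception in the question is raised while creating a selector, it may help to measure how many descriptors each selector costs in the JVM being used. The following is a rough sketch, assuming a HotSpot/OpenJDK JVM on a Unix-like system; the class SelectorFdCost and its helpers are hypothetical names, and the exact number of descriptors per selector depends on the JDK version and platform.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

public class SelectorFdCost {
    /** Open descriptor count of this JVM, or -1 if the platform does not expose it. */
    static long openFds() {
        Object os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1;
    }

    /** Opens `count` selectors, reports how many descriptors they added, then closes them. */
    static long fdsUsedBySelectors(int count) throws IOException {
        // Warm-up so that class loading does not distort the measurement.
        Selector.open().close();

        List<Selector> selectors = new ArrayList<>();
        long before = openFds();
        try {
            for (int i = 0; i < count; i++) {
                selectors.add(Selector.open());
            }
            return openFds() - before;
        } finally {
            for (Selector s : selectors) {
                s.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int count = 10;
        long used = fdsUsedBySelectors(count);
        System.out.println(count + " selectors added " + used + " descriptors");
    }
}
```

If each selector turns out to cost several descriptors, the number of selectors created by the application matters as much as the number of TCP connections it holds.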
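If you go through all of these man pages by hand, you may find something interesting. For example, I find it interesting that epoll_create, the underlying system call used by NIO channels, will return EMFILE ("Too many open files") if:
The per-user limit on the number of epoll instances imposed by /proc/sys/fs/epoll/max_user_instances was encountered. See epoll(7) for further details.
Now, that file does not actually exist on my system, but there are some limits defined in the files under /proc/sys/fs/epoll and /proc/sys/fs/inotify that you could potentially be hitting, particularly if you are running multiple instances of the same test on the same machine. Figuring out whether that is the case is a chore in itself; you could start by checking syslog for any messages...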
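To see which of these kernel limits exist on a given machine and what they are set to, the two directories can simply be dumped. This is a small sketch assuming a Linux system with /proc mounted; the class KernelLimits and the method readLimits are illustrative names, and the set of files found will differ between kernel versions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

public class KernelLimits {
    /** Maps each regular file directly under `dir` to its trimmed content; empty if `dir` is missing. */
    static Map<String, String> readLimits(String dir) {
        Map<String, String> limits = new TreeMap<>();
        Path root = Paths.get(dir);
        if (!Files.isDirectory(root)) {
            return limits;
        }
        try (Stream<Path> files = Files.list(root)) {
            files.filter(Files::isRegularFile).forEach(p -> {
                try {
                    limits.put(p.toString(), new String(Files.readAllBytes(p)).trim());
                } catch (IOException e) {
                    limits.put(p.toString(), "unreadable: " + e.getMessage());
                }
            });
        } catch (IOException | UncheckedIOException e) {
            // Directory could not be listed; return whatever was collected so far.
        }
        return limits;
    }

    public static void main(String[] args) {
        // The two directories named in the answer above.
        String[] dirs = {"/proc/sys/fs/epoll", "/proc/sys/fs/inotify"};
        for (String dir : dirs) {
            Map<String, String> limits = readLimits(dir);
            if (limits.isEmpty()) {
                System.out.println(dir + ": no limit files found");
            }
            limits.forEach((file, value) -> System.out.println(file + " = " + value));
        }
    }
}
```

Comparing these values with the number of selectors and watches the test creates is one way to rule this cause in or out before turning to strace.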
Good luck!
Regarding java - "Too many open files" exception on an "unlimited" system, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22355207/