c# - StackOverflowException with Lazy<T> when throwing an exception

Tags: c#

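A very simple sample application (.NET 4.6.2) produces a StackOverflowException at a recursion depth of 12737, or at a recursion depth of 10243 if the innermost function call throws an exception. That is expected and fine.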

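If I use a Lazy<T> to briefly hold intermediate results, the StackOverflowException occurs at a recursion depth of 2207 when no exception is thrown, and at a recursion depth of 105 when an exception is thrown.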

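Note: the StackOverflowException at a depth of 105 is only observable when compiled for x64. With x86 (32-bit), the effect first occurs at a depth of 4272. Mono (like the one used by https://repl.it) works without problems up to a depth of 74200.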

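The StackOverflowException does not occur within the deep recursion, but while ascending back to the main routine. The finally block is processed for some of the depths, then the program dies: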

Exception System.InvalidOperationException at 105
Finally at 105
...
Exception System.InvalidOperationException at 55
Finally at 55
Exception System.InvalidOperationException at 54
Finally at 54
Process is terminated due to StackOverflowException.

Or in the debugger:
The program '[xxxxx] Test.vshost.exe' has exited with code -2147023895 (0x800703e9).

Can anyone explain this?
using System;

public class Program
{
    private class Test
    {
        private int maxDepth;

        private int CalculateWithLazy(int depth)
        {
            try
            {
                var lazy = new Lazy<int>(() => this.Calculate(depth));
                return lazy.Value;
            }  
            catch (Exception e)
            {
                Console.WriteLine("Exception " + e.GetType() + " at " + depth);
                throw;
            }
            finally
            {
                Console.WriteLine("Finally at " + depth);
            }
        }

        private int Calculate(int depth)
        {
            if (depth >= this.maxDepth) throw new InvalidOperationException("Max. recursion depth reached.");
            return this.CalculateWithLazy(depth + 1);
        }

        public void Run()
        {
            for (int i = 1; i < 100000; i++)
            {
                this.maxDepth = i;

                try
                {
                    Console.WriteLine("MaxDepth: " + i);
                    this.CalculateWithLazy(0);

                }
                catch { /* ignore */ }
            }
        }
    }

    public static void Main(string[] args)
    {
        var test = new Test();
        test.Run();
        Console.Read();
    }
}

Update: The problem can be reproduced without the usage of Lazy<T>, just by having a try-catch-throw block in the recursive method.
        [MethodImpl(MethodImplOptions.NoInlining)]
        private int Calculate(int depth)
        {
            try
            {
                if (depth >= this.maxDepth) throw new InvalidOperationException("Max. recursion depth reached.");
                return this.Calculate2(depth + 1);
            }
            catch
            {
                throw;
            }
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        private int Calculate2(int depth) // just to prevent the compiler from tail-recursion-optimization
        {
            return this.Calculate(depth);
        }

        public void Run()
        {
            for (int i = 1; i < 100000; i++)
            {
                this.maxDepth = i;

                try
                {
                    Console.WriteLine("MaxDepth: " + i);
                    this.Calculate(0);

                }
                catch(Exception e)
                {
                    Console.WriteLine("Finished with " + e.GetType());
                }
            }
        }

Best answer

The problem can be reproduced without the usage of Lazy<T>, just by having a try-catch-throw block in the recursive method.


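You already noticed the source of the behavior. Now let me explain why it happens, because it doesn't seem to make sense, right?
It doesn't make sense because the exception is caught and then immediately rethrown, so the stack should shrink, right?
The following test: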
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

internal class Program
{
    private int _maxDepth;

    [MethodImpl(MethodImplOptions.NoInlining)]
    private int Calculate(int depth)
    {
        try
        {
            Console.WriteLine("In try at depth {0}: stack frame count = {1}", depth, new StackTrace().FrameCount);

            if (depth >= _maxDepth)
                throw new InvalidOperationException("Max. recursion depth reached.");

            return Calculate2(depth + 1);
        }
        catch
        {
            Console.WriteLine("In catch at depth {0}: stack frame count = {1}", depth, new StackTrace().FrameCount);
            throw;
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private int Calculate2(int depth) => Calculate(depth);

    public void Run()
    {
        try
        {
            _maxDepth = 10;
            Calculate(0);
        }
        catch (Exception e)
        {
            Console.WriteLine("Finished with " + e.GetType());
        }
    }

    public static void Main() => new Program().Run();
}
seems to validate that hypothesis:
In try at depth 0: stack frame count = 3
In try at depth 1: stack frame count = 5
In try at depth 2: stack frame count = 7
In try at depth 3: stack frame count = 9
In try at depth 4: stack frame count = 11
In try at depth 5: stack frame count = 13
In try at depth 6: stack frame count = 15
In try at depth 7: stack frame count = 17
In try at depth 8: stack frame count = 19
In try at depth 9: stack frame count = 21
In try at depth 10: stack frame count = 23
In catch at depth 10: stack frame count = 23
In catch at depth 9: stack frame count = 21
In catch at depth 8: stack frame count = 19
In catch at depth 7: stack frame count = 17
In catch at depth 6: stack frame count = 15
In catch at depth 5: stack frame count = 13
In catch at depth 4: stack frame count = 11
In catch at depth 3: stack frame count = 9
In catch at depth 2: stack frame count = 7
In catch at depth 1: stack frame count = 5
In catch at depth 0: stack frame count = 3
Finished with System.InvalidOperationException
Well... no. It's not that simple.

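.NET exceptions are built on top of Windows Structured Exception Handling (SEH), which is a complicated beast.
There are two articles you need to read if you want to know the details. They're old, but the parts relevant to your question are still accurate:
  • The Exception Model (in the CLR)
  • A Crash Course on the Depths of Win32™ Structured Exception Handling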

    But let's focus on the matter at hand.
    Here's what the first article says about what happens when the stack is unwound (emphasis mine):

    The other form of unwind is the actual popping of the CPU stack. This doesn’t happen as eagerly as the popping of the SEH records. On X86, EBP is used as the frame pointer for methods containing SEH. ESP points to the top of the stack, as always. Until the stack is actually unwound, all the handlers are executed on top of the faulting exception frame. So the stack actually grows when a handler is called for the first or second pass. EBP is set to the frame of the method containing a filter or finally clause so that local variables of that method will be in scope.

    The actual popping of the stack doesn’t occur until the catching ‘except’ clause is executed.


    Let's modify our previous test program to check this:
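    // Compile with unsafe code enabled (/unsafe or <AllowUnsafeBlocks>).
    using System;
    using System.Diagnostics;
    using System.Runtime.CompilerServices;

    internal class Program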
    {
        private int _maxDepth;
        private UIntPtr _stackStart;
    
        [MethodImpl(MethodImplOptions.NoInlining)]
        private int Calculate(int depth)
        {
            try
            {
                Console.WriteLine("In try at depth {0}: stack frame count = {1}, stack offset = {2}",depth, new StackTrace().FrameCount, GetLooseStackOffset());
    
                if (depth >= _maxDepth)
                    throw new InvalidOperationException("Max. recursion depth reached.");
    
                return Calculate2(depth + 1);
            }
            catch
            {
                Console.WriteLine("In catch at depth {0}: stack frame count = {1}, stack offset = {2}", depth, new StackTrace().FrameCount, GetLooseStackOffset());
                throw;
            }
        }
    
        [MethodImpl(MethodImplOptions.NoInlining)]
        private int Calculate2(int depth) => Calculate(depth);
    
        public void Run()
        {
            try
            {
                _stackStart = GetSomePointerNearTheTopOfTheStack();
                _maxDepth = 10;
                Calculate(0);
            }
            catch (Exception e)
            {
                Console.WriteLine("Finished with " + e.GetType());
            }
        }
    
        [MethodImpl(MethodImplOptions.NoInlining)]
        private static unsafe UIntPtr GetSomePointerNearTheTopOfTheStack()
        {
            int dummy;
            return new UIntPtr(&dummy);
        }
    
        private int GetLooseStackOffset() => (int)((ulong)_stackStart - (ulong)GetSomePointerNearTheTopOfTheStack());
    
        public static void Main() => new Program().Run();
    }
    
    Here's the result:
    In try at depth 0: stack frame count = 3, stack offset = 384
    In try at depth 1: stack frame count = 5, stack offset = 752
    In try at depth 2: stack frame count = 7, stack offset = 1120
    In try at depth 3: stack frame count = 9, stack offset = 1488
    In try at depth 4: stack frame count = 11, stack offset = 1856
    In try at depth 5: stack frame count = 13, stack offset = 2224
    In try at depth 6: stack frame count = 15, stack offset = 2592
    In try at depth 7: stack frame count = 17, stack offset = 2960
    In try at depth 8: stack frame count = 19, stack offset = 3328
    In try at depth 9: stack frame count = 21, stack offset = 3696
    In try at depth 10: stack frame count = 23, stack offset = 4064
    In catch at depth 10: stack frame count = 23, stack offset = 13024
    In catch at depth 9: stack frame count = 21, stack offset = 21888
    In catch at depth 8: stack frame count = 19, stack offset = 30752
    In catch at depth 7: stack frame count = 17, stack offset = 39616
    In catch at depth 6: stack frame count = 15, stack offset = 48480
    In catch at depth 5: stack frame count = 13, stack offset = 57344
    In catch at depth 4: stack frame count = 11, stack offset = 66208
    In catch at depth 3: stack frame count = 9, stack offset = 75072
    In catch at depth 2: stack frame count = 7, stack offset = 83936
    In catch at depth 1: stack frame count = 5, stack offset = 92800
    In catch at depth 0: stack frame count = 3, stack offset = 101664
    Finished with System.InvalidOperationException
    
    Oops. Yes, the stack actually grows while we're handling the exception.
    With _maxDepth = 1000, execution ends at:
    In catch at depth 933: stack frame count = 1869, stack offset = 971232
    In catch at depth 932: stack frame count = 1867, stack offset = 980096
    In catch at depth 931: stack frame count = 1865, stack offset = 988960
    In catch at depth 930: stack frame count = 1863, stack offset = 997824
    
    Process is terminated due to StackOverflowException.
    
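    So about 997824 bytes of stack space are in use according to our own code, which is pretty close to the 1 MB default thread stack size on Windows. The calling CLR code should account for the difference.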
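One way to check that the limit really is the thread's stack size (a minimal sketch, not part of the measurements above) is to run the same try-catch-throw recursion on a thread created with a larger maximum stack size. `BigStackDemo`, `RunWithStack`, `CountCatches` and the 64 MB figure are names and values made up for this illustration; `Thread(ThreadStart, int)` is the standard constructor that takes a maximum stack size in bytes.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public static class BigStackDemo
{
    // Static state keeps the sketch short; it is not thread-safe.
    private static int _maxDepth;
    private static int _catchCount;

    // Runs 'work' on a new thread with the requested maximum stack size
    // and returns its result (hypothetical helper for this illustration).
    public static T RunWithStack<T>(Func<T> work, int stackSizeBytes)
    {
        T result = default(T);
        Exception error = null;
        var thread = new Thread(() =>
        {
            try { result = work(); }
            catch (Exception e) { error = e; }
        }, stackSizeBytes);
        thread.Start();
        thread.Join();
        if (error != null)
            throw new InvalidOperationException("Worker failed.", error);
        return result;
    }

    // Same try-catch-throw recursion as in the answer; returns how many
    // catch blocks ran while the exception travelled back up.
    public static int CountCatches(int maxDepth)
    {
        _maxDepth = maxDepth;
        _catchCount = 0;
        try { Calculate(0); }
        catch (InvalidOperationException) { /* expected */ }
        return _catchCount;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static int Calculate(int depth)
    {
        try
        {
            if (depth >= _maxDepth)
                throw new InvalidOperationException("Max. recursion depth reached.");
            return Calculate2(depth + 1);
        }
        catch
        {
            _catchCount++;
            throw;
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static int Calculate2(int depth) => Calculate(depth);

    public static void Main()
    {
        // 64 MB is an arbitrary, generous value chosen for this sketch.
        int catches = RunWithStack(() => CountCatches(1000), 64 * 1024 * 1024);
        Console.WriteLine("Catch blocks executed: " + catches);
    }
}
```

If the larger stack is honored, the run with _maxDepth = 1000 that died above should complete and report one catch block per recursion level. This only moves the limit; the per-level stack growth during exception handling stays the same.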

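    As you may know, SEH exceptions are handled in two passes:
  • The first pass (filtering) finds the first exception handler able to handle the exception. In C#, this basically checks whether the catch clause matches the exception type, and executes the when part of catch (...) when (...) if present.
  • The second pass (unwinding) actually handles the exception.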
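The two passes can be observed from C# (a small sketch; `TwoPassDemo`, `Run`, `Inner` and `Record` are hypothetical names for this illustration). An exception filter on an outer frame runs during the first pass, before the finally block of the inner frame, which only runs during the second pass.

```csharp
using System;
using System.Collections.Generic;

public static class TwoPassDemo
{
    // Records the order in which the filter, the finally block
    // and the catch block execute.
    public static List<string> Run()
    {
        var log = new List<string>();
        try
        {
            Inner(log);
        }
        // First pass: the filter is evaluated while the stack of
        // Inner is still intact.
        catch (InvalidOperationException) when (Record(log, "filter"))
        {
            // Runs after the second pass has unwound Inner.
            log.Add("catch");
        }
        return log;
    }

    private static void Inner(List<string> log)
    {
        try
        {
            throw new InvalidOperationException("boom");
        }
        finally
        {
            // Second pass: runs while unwinding towards the handler.
            log.Add("finally");
        }
    }

    private static bool Record(List<string> log, string entry)
    {
        log.Add(entry);
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

On the CLR this prints `filter, finally, catch`: the handler is located first, and only then are the inner frames unwound.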

    Here's what the second article says about the unwinding process:

    When an exception occurs, the system walks the list of EXCEPTION_REGISTRATION structures until it finds a handler for the exception. Once a handler is found, the system walks the list again, up to the node that will handle the exception. During this second traversal, the system calls each handler function a second time. The key distinction is that in the second call, the value 2 is set in the exception flags. This value corresponds to EH_UNWINDING.

    [...]

    After an exception is handled and all the previous exception frames have been called to unwind, execution continues wherever the handling callback decides.


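    This just confirms what the first article says.
    The first pass needs to preserve the faulting stack in order to be able to inspect its state, and to be able to resume execution at the faulting instruction (yes, that's a thing: it's very low-level, but you can patch the cause of the error and resume execution as if no error had occurred).
    The second pass is implemented like the first one, except that the handlers now get the EH_UNWINDING flag. This means the faulting stack is still preserved at that point, until the final handler decides where to resume execution.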

    The stack pointer moves 36 bytes for a 32-Bit process, but whopping 8976 bytes for a 64-bit process here while unwinding the stack. What's the explanation for this?


    Good question!
    That's because 32-bit and 64-bit SEH are completely different. Here's some reading material (emphasis mine):

    Because on the x86 each function that uses SEH has this aforementioned construct as part of its prolog, the x86 is said to use frame based exception handling. There are a couple of problems with this approach:

    • Because the exception information is stored on the stack, it is susceptible to buffer overflow attacks.
    • Overhead. Exceptions are, well, exceptional, which means the exception will not occur in the common case. Regardless, every time a function is entered that uses SEH, these extra instructions are executed.

    Because the x64 was a chance to do away with a lot of the cruft that had been hanging around for decades, SEH got an overhaul that addressed both issues mentioned above. On the x64, SEH has become table-based, which means when the source code is compiled, a table is created that fully describes all the exception handling code within the module. This table is then stored as part of the PE header. If an exception occurs, the exception table is parsed by Windows to find the appropriate exception handler to execute. Because exception handling information is tucked safely away in the PE header, it is no longer susceptible to buffer overflow attacks. In addition, because the exception table is generated as part of the compilation process, no runtime overhead (in the form of push and pop instructions) is incurred during normal processing.

    Of course, table-based exception handling schemes have a couple of negative aspects of their own. For example, table-based schemes tend to take more space in memory than stack-based schemes. Also, while overhead in the normal execution path is reduced, the overhead it takes to process an exception is significantly higher than in frame-based approaches. Like everything in life, there are trade-offs to consider when evaluating whether the table-based or a frame-based approach to exception handling is "best."


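    In short, the happy path has been optimized in x64, while the exceptional path has become slower. If you ask me, that's a very good thing.
    Here's another quote from the first article I linked earlier: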

    Both IA64 and AMD64 have a model for exception handling that avoids reliance on an explicit handler chain that starts in TLS and is threaded through the stack. Instead, exception handling relies on the fact that on 64-bit systems we can perfectly unwind a stack. And this ability is itself due to the fact that these chips are severely constrained on the calling conventions they support.

    [...]

    Anyway, on 64-bit systems the correspondence between an activation record on the stack and the exception record that applies to it is not achieved through an FS:[0] chain. Instead, unwinding of the stack reveals the code addresses that correspond to a particular activation record. These instruction pointers of the method are looked up in a table to find out whether there are any __try/__except/__finally clauses that cover these code addresses. This table also indicates how to proceed with the unwind by describing the actions of the method epilog.


    Yes. A completely different approach.
    But let's look at the x64 call stack to see where the stack space is actually used. I modified Calculate like this:
    private int Calculate(int depth)
    {
        try
        {
            if (depth >= _maxDepth)
                throw new InvalidOperationException("Max. recursion depth reached.");
    
            return Calculate2(depth + 1);
        }
        catch
        {
            if (depth == _maxDepth)
            {
                Console.ReadLine();
            }
    
            throw;
        }
    }
    
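    I put a breakpoint on Console.ReadLine(); and looked at the native call stack along with the value of the stack pointer register at each frame:
    [Image: native call stack]
    The addresses fffffffffffffffe and 0000000000008000 look very weird to me, but either way this shows how much stack space each frame consumes. Lots of stuff is going on in the Windows Native API (ntdll.dll), and the CLR adds some of its own.
    As far as the Windows internals are concerned, we're out of luck, since the source code isn't publicly available. But we can at least take a look at clr.dll!ClrUnwindEx, since that function is quite small but uses a lot of stack space: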
    void ClrUnwindEx(EXCEPTION_RECORD* pExceptionRecord, UINT_PTR ReturnValue, UINT_PTR TargetIP, UINT_PTR TargetFrameSp)
    {
        PVOID TargetFrame = (PVOID)TargetFrameSp;
    
        CONTEXT ctx;
        RtlUnwindEx(TargetFrame,
                    (PVOID)TargetIP,
                    pExceptionRecord,
                    (PVOID)ReturnValue, // ReturnValue
                    &ctx,
                    NULL);      // HistoryTable
    
        // doesn't return
        UNREACHABLE();
    }
    
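    It defines a CONTEXT variable on the stack, which is a large struct. I can only suppose that the 64-bit SEH functions use their stack space for similar purposes.
    Now, let's compare this to the 32-bit call stack:
    [Image: 32-bit call stack]
    As you can see, it's not at all the same as in 64-bit.
    Out of curiosity, I tested the behavior of a simple C++ program: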
    #include "stdafx.h"
    #include <iostream>
    
    __declspec(noinline) static char* GetSomePointerNearTheTopOfTheStack()
    {
        char dummy;
        return &dummy;
    }
    
    int main()
    {
        auto start = GetSomePointerNearTheTopOfTheStack();
    
        try
        {
            throw 42;
        }
        catch (...)
        {
            auto here = GetSomePointerNearTheTopOfTheStack();
            std::cout << "Difference in " << (sizeof(char*) * 8) << "-bit: " << (start - here) << std::endl;
        }
    
        return 0;
    }
    
    Here are the results:
    Difference in 32-bit: 2224
    Difference in 64-bit: 9744
    
    This further suggests that this isn't just a CLR thing, but is due to the differences in the underlying SEH implementation.

    Regarding "c# - StackOverflowException with Lazy<T> when throwing an exception", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47367058/
