c# - XDocument.Validate does not catch all errors against an XSD

Tags: c# xml xsd

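I have run into a very strange problem when validating an XML document against a valid XSD in C#, using either XDocument.Validate or an XmlReader with a suitably configured XmlReaderSettings. When the XML document contains errors, the validation fails to report all of them under certain conditions, and I cannot find a pattern to when that happens.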

Here is my XSD:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
			  targetNamespace="http://www.somesite.com/somefolder/messages"
			  xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xs:element name="Message">
    <xs:complexType>
     <xs:sequence>
      <xs:element name="Header">
         <xs:complexType>
          <xs:sequence>
           <xs:element name="MessageId" type="xs:string" />
           <xs:element name="MessageSource" type="xs:string" />
          </xs:sequence>
       </xs:complexType>
    </xs:element>
    <xs:element name="Body">
       <xs:complexType>
          <xs:sequence>
             <xs:element name="Abc001">
                <xs:complexType>
                   <xs:sequence>
                    <xs:element name="Abc002" type="xs:string" />
                    <xs:element name="Abc003" type="xs:string" minOccurs="0" />
                    <!--<xs:element name="Abc004" type="xs:string" />-->
                    <xs:element name="Abc004">
                       <xs:simpleType>
                         <xs:restriction base="xs:string">
                           <xs:maxLength value="200"/>
                         </xs:restriction>
                      </xs:simpleType>
                    </xs:element>
                      <xs:element name="Abc005">
                         <xs:complexType>
                            <xs:sequence>
                              <xs:element name="Abc006" type="xs:unsignedShort" />
                              <xs:element name="Abc007">
                                <xs:complexType>
                                  <xs:sequence>
                                    <xs:element name="Abc008" type="xs:string"/>
                                    <xs:element name="Abc009" type="xs:string" minOccurs="0"/>
                                    <xs:element name="Abc010" type="xs:string"/>
                                  </xs:sequence>
                                </xs:complexType>
                              </xs:element>
                              <xs:element name="Abc011" type="xs:date" />
                              <xs:element name="Abc012">
                                <xs:complexType>
                                  <xs:sequence>
                                    <xs:element name="Abc013" type="xs:string" />
                                    <xs:element name="Abc014" type="xs:string" />
                                  </xs:sequence>
                                </xs:complexType>
                              </xs:element>
                            </xs:sequence>
                         </xs:complexType>
                      </xs:element>
                   </xs:sequence>
                </xs:complexType>
             </xs:element>
          </xs:sequence>
       </xs:complexType>
    </xs:element>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>

Here is the XML document that is validated against this XSD:

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
	<Header>
		<MessageId>Lorem</MessageId>
		<MessageSource>Ipsum</MessageSource>
	</Header>
	<Body>
		<Abc001>
			<Abc002>dolor</Abc002>
			<Abc003>sit amet</Abc003>
			<Abc004>consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.</Abc004>
			<Abc005>
				<Abc006>1234</Abc006>
				<Abc007>
					<Abc008>Ut enim</Abc008>
					<Abc009>ad</Abc009>
					<Abc010>minim</Abc010>
				</Abc007>
				<Abc011>1982-10-17</Abc011>
				<Abc012>
					<Abc013>veniam</Abc013>
					<Abc014>nostrud</Abc014>
				</Abc012>
			</Abc005>
		</Abc001>
	</Body>
</Message>

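Now, when I introduce some validation errors into the XML and validate it against the XSD, it finds all the errors as expected. Below is the erroneous XML (I have added comments where the errors were introduced):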

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
	<Header>
		<MessageId>Lorem</MessageId>
		<MessageSource>Ipsum</MessageSource>
	</Header>
	<Body>
		<Abc001>
			<Abc002>dolor</Abc002>
			<Abc003>sit amet</Abc003>
			
			<!--The value for Abc004 is increased beyond the allowed 200 characters-->
			
			<Abc004>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</Abc004>
			<Abc005>
				<Abc006>1234</Abc006>
				<Abc007>
					<Abc008>Ut enim</Abc008>
					<ABC009>AD</ABC009>
					
					<!--<Abc010>minim</Abc010>  Required element removed-->
				</Abc007>
				
				<!--Date format below is wrong-->
				<Abc011>1982-10-37</Abc011>
				
				<Abc012>
					<Abc013>veniam</Abc013>
					<Abc014>nostrud</Abc014>
				</Abc012>
			</Abc005>

			<!--the element below is not allowed-->
			<Abc15>Not allowed</Abc15>
		</Abc001>
	</Body>
</Message>

Here is the XML I generate, listing all the errors:

<MessageResponse xmlns="http://www.somesite.com/somefolder/messages">
    <Result>false</Result>
    <Status>Failed</Status>
    <FaultCount>4</FaultCount>
    <Faults>
        <Fault>
            <FaultCode>ERR01</FaultCode>
            <FaultMessage>The 'http://www.somesite.com/somefolder/messages:Abc004' element is invalid - The value 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.' is invalid according to its datatype 'String' - The actual length is greater than the MaxLength value.</FaultMessage>
        </Fault>
        <Fault>
            <FaultCode>ERR02</FaultCode>
            <FaultMessage>The element 'Abc007' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'ABC009' in namespace 'http://www.somesite.com/somefolder/messages'. List of possible elements expected: 'Abc009, Abc010' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
        </Fault>
        <Fault>
            <FaultCode>ERR03</FaultCode>
            <FaultMessage>The 'http://www.somesite.com/somefolder/messages:Abc011' element is invalid - The value '1982-10-37' is invalid according to its datatype 'http://www.w3.org/2001/XMLSchema:date' - The string '1982-10-37' is not a valid Date value.</FaultMessage>
        </Fault>
        <Fault>
            <FaultCode>ERR04</FaultCode>
            <FaultMessage>The element 'Abc001' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'Abc15' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
        </Fault>
    </Faults>
</MessageResponse>

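Here is the strange part. When I introduce one more error at the start of the "Abc001" element, keeping all the other existing errors, the result is completely off. Here is the XML with the newly introduced error: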

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
	<Header>
		<MessageId>Lorem</MessageId>
		<MessageSource>Ipsum</MessageSource>
	</Header>
	<Body>
		<Abc001>
			<!--newly introduced error - removed the following element-->
			<!--<Abc002>dolor</Abc002>-->
			<Abc003>sit amet</Abc003>
			<!--The value for Abc004 is increased beyond the allowed 200 characters-->
			<Abc004>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</Abc004>
			<Abc005>
				<Abc006>1234</Abc006>
				<Abc007>
					<Abc008>Ut enim</Abc008>
					<ABC009>AD</ABC009>
					<!--<Abc010>minim</Abc010>-->
				</Abc007>
				<Abc011>1982-10-37</Abc011>
				<Abc012>
					<Abc013>veniam</Abc013>
					<Abc014>nostrud</Abc014>
				</Abc012>
			</Abc005>
			<!--the element below is not allowed-->
			<Abc15>Not allowed</Abc15>
		</Abc001>
	</Body>
</Message>

Finally, here is the validation result:

<MessageResponse xmlns="http://www.somesite.com/somefolder/messages">
    <Result>false</Result>
    <Status>Failed</Status>
    <FaultCount>1</FaultCount>
    <Faults>
        <Fault>
            <FaultCode>ERR01</FaultCode>
            <FaultMessage>The element 'Abc001' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'Abc003' in namespace 'http://www.somesite.com/somefolder/messages'. List of possible elements expected: 'Abc002' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
        </Fault>
    </Faults>
</MessageResponse>

Here is the C# code I use for validation:

public async Task<IMIDPreValidationAckMessage> ValidateXmlMessage( XDocument doc )
    {
        var result = new PreValidationAckMessage();
        result.Result = true;
        result.Status = "Succeeded";

        var xsd = HttpContext.Current.Server.MapPath( "~/message01.xsd" );

        try
        {
            var uri = new System.Uri(xsd);

            var localPath = uri.LocalPath;

            var docNameSpace = doc.Root.Name.Namespace.NamespaceName;

            XmlSchemaSet schemas = new XmlSchemaSet();
            schemas.Add( docNameSpace, localPath );

            XmlReaderSettings xrs = new XmlReaderSettings();
            xrs.ValidationType = ValidationType.Schema;
            xrs.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
            xrs.Schemas = schemas;

            result.XSDNamespace = doc.Root.GetDefaultNamespace().NamespaceName;
            var errCode = 1;

            // Record every validation event the reader reports as a numbered fault.
            xrs.ValidationEventHandler += ( s, e ) =>
            {
                result.Result = false;
                result.Status = "Failed";
                result.FaultCount++;
                result.Faults.Add( new Fault
                {
                    FaultCode = "ERR" + errCode++.ToString().PadLeft( 2, '0' ),
                    FaultMessage = e.Message
                } );
            };

            using ( XmlReader xr = XmlReader.Create( doc.CreateReader(), xrs ) )
            {
                while ( xr.Read() ) { }
            }
        }
        catch ( System.Exception )
        {
            // Anything other than a reported validation event (for example, a schema that fails to load).
            result.Result = false;
            result.Status = "Unknown Error";
        }
        return result;
    }

Can anyone tell me what is wrong here?

Best answer

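XmlReader appears to stop validating an element's content as soon as it hits the first error in that element. The documentation for the ValidationEventHandler of the old (obsolete) XmlValidatingReader describes this: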

If an element reports a validation error, the rest of the content model for that element is not validated, however, its children are validated. The reader only reports the first error for a given element.

The same seems to hold for the regular XmlReader (although its documentation does not mention it explicitly).

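In the first example, each error sits either in an innermost element (for example, an invalid text value) or in the last child of its parent, so all of them are reported and nothing is skipped. In the last example, however, you introduced an error at the very start of the Abc001 element, so the rest of Abc001's content is skipped, along with all the errors in it.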
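
To make the behavior easier to reproduce, here is a minimal sketch that is not part of the original post. It assumes a made-up, trimmed-down schema (namespace `urn:demo`, elements `Parent`, `A`, `B`, `C`) and a hypothetical helper `ValidationDemo.Validate`, and it uses the `XDocument.Validate` extension method from `System.Xml.Schema`. It validates three documents: a valid one, one with errors only in leaf values, and one that also omits the first required child.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

public static class ValidationDemo
{
    // Hypothetical schema: Parent requires A, then B (at most 3 characters), then C (a date).
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
        targetNamespace='urn:demo' elementFormDefault='qualified'>
      <xs:element name='Parent'>
        <xs:complexType>
          <xs:sequence>
            <xs:element name='A' type='xs:string'/>
            <xs:element name='B'>
              <xs:simpleType>
                <xs:restriction base='xs:string'>
                  <xs:maxLength value='3'/>
                </xs:restriction>
              </xs:simpleType>
            </xs:element>
            <xs:element name='C' type='xs:date'/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>";

    // Validates the XML text against the schema above and returns every reported message.
    public static List<string> Validate(string xml)
    {
        var schemas = new XmlSchemaSet();
        using (var schemaReader = XmlReader.Create(new StringReader(Xsd)))
        {
            schemas.Add("urn:demo", schemaReader);
        }

        var errors = new List<string>();
        // The handler is invoked once per reported validation event.
        XDocument.Parse(xml).Validate(schemas, (sender, e) => errors.Add(e.Message));
        return errors;
    }

    public static void Main()
    {
        string[] documents =
        {
            // Valid document.
            "<Parent xmlns='urn:demo'><A>a</A><B>abc</B><C>2020-01-01</C></Parent>",
            // Errors only in leaf values: B is too long, C is not a valid date.
            "<Parent xmlns='urn:demo'><A>a</A><B>abcdef</B><C>2020-01-40</C></Parent>",
            // Same leaf errors, but the first required child A is also missing.
            "<Parent xmlns='urn:demo'><B>abcdef</B><C>2020-01-40</C></Parent>"
        };

        foreach (var xml in documents)
        {
            var errors = Validate(xml);
            Console.WriteLine(errors.Count + " error(s)");
            foreach (var message in errors)
            {
                Console.WriteLine("  " + message);
            }
        }
    }
}
```

Going by the results shown in the question, the third document would be expected to report only the unexpected child where A should be, with the B and C errors going unreported until A is restored. The exact messages and counts may differ between .NET versions, so treat the printed output as something to observe rather than a guarantee.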

Regarding c# - XDocument.Validate not catching all errors against an XSD, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44330443/
