XML进阶:Level 1 - XML简介
By Rob Sheldon,2014/03/26 (首次发表于: 2012/09/20) 关于系列本文属于进阶系列:XML进阶 自2003年以来,XML一直是SQL标准的一部分,许多动态管理视图也返回XML数据,所以,对于每一个数据库管理员而言,同样是至关重要的。现在,这个行业更多地用于由文档标记定义的数据,数据库开发人员和数据库管理员比以往任何时候都更了解XML技术和使用它的方式。 在这一系列文章中,罗伯特·谢尔顿(Robert Sheldon)将尽力把复杂的东西简单化。 注意:这一系列的文章已经可作为电子书获取。 可扩展标记语言(XML)首先在SQL Server的SQL Server 2000发行版中提供支持。然而,与XML相关的功能仅限于专注于映射关系数据和XML数据的数据管理功能。例如,SQL Server 2000中添加了FOR XML子句,可以将关系查询结果作为XML返回。 但是到添加了XML数据类型的SQL Server 2005发布时,支持XML就变得有意义起来。 XML数据类型允许将XML文档本身存储在使用该类型配置的列和变量中。数据类型还支持一组可用于检索和修改XML文档中特定组件的方法。 为了充分利用SQL Server支持的与XML相关的功能,你可能会发现对XML本身有一个基本的了解是很有用的。为此,XML进阶系列的第一部分将介绍XML是什么,并对构成XML文档的各种组件进行介绍。 XML概述和超文本标记语言(HTML)相似,XML也是一种标记语言,它使用标签来描述和这些标签相关联的数据的性质。使XML可扩展的是它的自描述性质,也就是创建特定于XML文档中包含的数据值的标签。在HTML中,这些标签是预定义的。 (通过XML组件,XML的可扩展性将变得更加清晰。) 尽管XML具有可扩展性,但它仍然是一种标准化语言,必须符合由万维网联盟(W3C)定义的特定格式规则集。由于这种标准化与用于显示数据的HTML不同,这种语言被广泛地用于传输和存储数据。 XML可以轻松地在异构系统之间共享数据,而不管硬件,操作系统或应用程序类型如何,XML的普遍应用意味着数据可以更少的进行人为干预处理。同时可以控制数据的描述、排列和显示方式。 XML 组件构成XML文档的主要组件和支配这些组件使用的规则通常非常简单,但为了使XML文档能够被SQL Server XML解析器正确处理,你必须严格遵守一些规则。 XML文档中主要包含两种类型的信息:要存储的数据和描述数据的标签。 标签由一组尖括号(<>)组成,它们包含描述与标签相关联的数据的描述性词或复合词(无空格)。 正是由于这些标签的自描述性质,XML通常被认为是元语言。 每个离散的存储数据片段都被封装在开始标签和结束标签中,如下例所示: <Person>John Doe</Person> 在这种情况下,开始标签是 同样的,标签和封闭的数据代表一个单一的元素。 但是,元素并不总是包含数据。 一个空的元素可以以两种方式之一呈现。 第一个是通过指定打开和关闭标签,但不包括数据,如下例所示: <Person></Person> 表示一个空白元素的另一种方法是只使用一组括号,但仍包括正斜杠: <Person /> 再者,只有当元素不包含值时,才能使用这个方法。 以后你在XML进阶系列中可能看到模式需要一个没有值的元素。 在这种情况下,可以使用缩写格式来表示元素的两个标签。 无论元素是否包含值,只要使用两个标签,开始和关闭标签必须完全匹配,精确到大小写(除了结束标签中的正斜杠)。 例如,以下元素在SQL Server XML解析器中生成错误,因为两个标签不能匹配: <person> John Doe </ Person> 开始标签中的描述性文字都是小写的; 然而,结束标签中的描述性词应以大写字母开头。 开始和结束标签必须匹配,以被认为是适当的或者格式良好的XML。 但是你可以将元素嵌入到彼此中。 在以下示例中,我将 <People> <Person>John Doe</Person> <Person>Jane Doe</Person> </People> 请注意,每个 SQL Server还允许将XML片段存储在XML列或变量中。 片段是没有根元素的XML代码块,例如以下示例中显示的两个元素: <Person>John Doe</Person> <Place>Seattle,WA</Place> 元素必须仍然是格式良好的XML,也就是具有包含数据的匹配标签,但它们不一定是XML文档。 像以后你将在XML进阶系列中看到的那样,你可以指定只允许在XML列或变量中使用XML文档,但现在只要知道SQL Server可以区分和存储XML文档和片段即可。 在其他元素中嵌入元素时,必须确保子元素在完成父元素之前已经完成。 例如,在以下示例中,我在 <People><Person>John Doe</People></Person> 你必须确保无论多少级别包含嵌入元素,你的子元素都是完整的。 在下面的示例中, <People> <Person> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person> <FirstName>Jane</FirstName> <LastName>Doe</LastName> </Person> </People> 在这种情况下, 元素也可以具有与它们相关联的属性。 属性是可以赋值的属性。 该属性被定义为元素的开始标签的一部分。 在以下示例中,我已经将id属性添加到 <People> <Person id="1234"> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person id="5678"> <FirstName>Jane</FirstName> <LastName>Doe</LastName> </Person> </People> 如示例所示,属性由属性名称(在这种情况下为id),后跟等号,属性值由双引号括起来。因此, 许多XML文档中包含的另一个组件是声明,它至少指定了文档符合的XML标准的版本。到目前为止,只有两个版本:1.0和1.1。如果使用XML 1.0,声明是不必要的;但是XML 1.1需要声明。因此,你应该知道如何在XML文档中包含声明。 如果你包含声明,则必须将其放在文档的开头,用 <?xml version="1.0" encoding="UTF-8"?> <People> <Person id="1234"> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person id="5678"> <FirstName>Jane</FirstName> <LastName>Doe</LastName> </Person> </People> 您还可以向XML文档添加注释。 要做到这一点,只需在 <?xml version="1.0" encoding="UTF-8"?> <!-- A list of current clients --> <People> <Person id="1234"> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person id="5678"> <FirstName>Jane</FirstName> <LastName>Doe</LastName> </Person> </People> 正如你所看到的那样,我已经添加了评论当前客户端的列表,附加在注释标签中。 SQL Server XML解析器将忽略标签内的任何内容,因此你不仅可以使用注释功能提供有关XML文档及其数据的信息,还可以挂起保留不想作为文档进行处理的XML代码部分。 另一个使用XML时要考虑的是,当元素值出现时,某些字符将无法解析。 例如,您不能在元素的值中包含&符(&),就像我在以下示例中的 <?xml version="1.0" encoding="UTF-8"?> <!-- A list of current clients --> <People> <Person id="1234"> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person id="5678"> <FirstName>Jane</FirstName> <LastName>Doe</LastName> <FavoriteBook>Crime & Punishment</FavoriteBook> </Person> </People> 如果我尝试将此XML文档分配给XML列或变量,则 <?xml version="1.0" encoding="UTF-8"?> <!-- A list of current clients --> <People> <Person id="1234"> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person id="5678"> <FirstName>Jane</FirstName> <LastName>Doe</LastName> <FavoriteBook>Crime & Punishment</FavoriteBook> </Person> </People> 请注意,我已经用了&amp;实体参考。现在XML解析器处理
该示例提出的另一个问题是子元素不一定要从一个父实例到另一个父实例相同。你可以看到, 总结在这个级别,我们已经看到组成一个XML文档需要的主要组件。 元素作为所有XML文档的基本构建块,每个元素由开始标签和结束标签划定,数据值本身被包含在这两个标签之间。 元素可以彼此嵌入,但是一个元素 - 根 - 必须作为XML文档中所有其他元素的父元素。 元素还可以包括被定义为元素的开始标签的一部分的属性。 我们已经尽可能方便地了解了如何组织一个XML文档,但本课程的目的并不是培养如何创建这些类型的文档,而是提供XML的介绍以便大家能更有效地在SQL Server中使用XML工作 。 在下一节,我们将介绍如何在SQL Server中实现XML数据类型,以及如何将其分配给列和变量以便存储XML文档和XML片段。 原文 Stairway to XML: Level 1 - Introduction to XMLBy Rob Sheldon,2014/03/26 (first published: 2012/09/20) The SeriesThis article is part of the Stairway Series: Stairway to >XML XML has been part of the SQL Standard since 2003,and it is also essential for any DBA because so many of the dynamic management views return XML data. Now that the industry is more used to data defined by document markup,it is becoming more important than ever for Database Developers and DBAs to understand the technology and to know where it makes sense to use XML. In this series of articles,Robert Sheldon flexes his talent to make the complicated seem simple. Note: This series of articles is now available as an eBook. Support for the eXtensible Markup Language (XML) was first introduced in SQL Server with the release of SQL Server 2000. However,XML-related features were limited to data management capabilities that focused on mapping relational and XML data. For example,SQL Server 2000 added the FOR XML clause,which lets you return relational query results as XML. However,it wasn’t until the release of SQL Server 2005—when the XML data type was added—that support for XML got interesting. The XML data type lets you natively store XML documents in columns and variables configured with that type. The data type also supports a set of methods you can use to retrieve and modify specific components within the XML document. To take full advantage of the XML-related features supported in SQL Server,you might find it useful to have a fundamental understanding of XML itself. To that end,this first Level of the Stairway to XML series explains what XML is and describes the various components that make up an XML document. An Overview of XMLSimilar to the HyperText Markup Language (HTML),XML is a markup language that uses tags to delineate and describe the nature of the data associated with those tags. What makes XML extensible is its self-describing nature,that is,you create tags that are specific to the data values contained in the XML document. In HTML,those tags are pre-defined. (XML’s extensible nature will become clearer as we work through the XML components.) Despite its extensibility,XML is still a standardized language that must conform to a specific set of formatting rules,as defined by the World Wide Web Consortium (W3C). Because of this standardization,the language has been widely adopted in order to transport and store data,unlike HTML,which is used to display data. XML makes it possible to easily share data among heterogeneous systems,regardless of hardware,operating system,or application type,and XML’s universal adoption means that data can be processed with little human intervention. At the same time,you can control how the data is described,while also controlling how the data is ordered and displayed. XML ComponentsThe primary components that make up an XML document—and the rules that govern the use of >those components—are generally very straightforward,but you must adhere strictly to these rules in order for an XML document to be properly processed by the SQL Server XML parser. There are primarily two types of information included in an XML document: the data to be stored and the tags that describe the data. A tag is made up of a set of angle brackets (< >) that enclose a descriptive word or compound word (no spaces) that describes the data associated with the tag. It’s because of the self-describing nature of these tags that XML is often considered a meta-language. Each discrete piece of stored data is enclosed in an opening tag and a closing tag,as shown in the following example: <Person>John Doe</Person> In this case,the opening tag is Together,the tags and enclosed data represent a single element. However,an element does not always have to contain data. An empty element can be rendered in one of two ways. The first is by specifying the opening and closing tags,but including no data,as I do in the following example: <Person></Person> Another way to represent an empty element is use only one set of brackets,but still include the forward slash: <Person /> Again,this method can be used only when an element contains no value. As you’ll see later in the Stairway to XML series,a schema might require an element for which there is no value. In that case,you can use the shortened format to represent the both tags of the element. Whether or not an element contains a value,whenever both tags are used,the opening and closing tags must match exactly,down to the capitalization (except for the forward slash in the closing tag). For instance,the following element generates an error in the SQL Server XML parser because the case is different between the two tags: <person>John Doe</Person> The descriptive word in the opening tag is all lowercase; however,the descriptive word in the closing tag begins with a capital letter. The opening and closing tags must match to be considered proper,or well formed,XML. You can,however,embed elements within each other. In the following example,I embed two instances of the <People> <Person>John Doe</Person> <Person>Jane Doe</Person> </People> Notice that each SQL Server also permits you to store XML fragments in an <Person>John Doe</Person> <Place>Seattle,WA</Place> The elements must still be well formed XML,have matching tags that enclose the data,but they don’t have to be an XML document. As you’ll see later in the Stairway to XML series,you can specify that only XML documents be permitted in an When you embed elements within other elements,you must ensure that the child elements are complete before you end the parent element. For instance,in the following example,I end the <People><Person>John Doe</People></Person> You must ensure that your child elements are complete no matter how many levels contain embedded elements. In this following example,the <People> <Person> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person> <FirstName>Jane</FirstName> <LastName>Doe</LastName> </Person> </People> In this case,the Elements can also have attributes associated with them. An attribute is a property to which you can assign a value. The attribute is defined as part of the element’s opening tag. In the following example,I’ve added the <People> <Person id="1234"> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person id="5678"> <FirstName>Jane</FirstName> <LastName>Doe</LastName> </Person> </People> As the example demonstrates,an attribute consists of the attribute name (in this case, Another component contained in many XML documents is the declaration,which at a minimum specifies the version of the XML standard that the document conforms to. To date,there are only two versions: 1.0 and 1.1. If using XML 1.0,the declaration is not necessary; however,XML 1.1 requires one. For that reason,you should be aware of how to include a declaration in your XML document. If you include a declaration,you must place it at the beginning of the document,start the declaration with the <?xml version="1.0" encoding="UTF-8"?> <People> <Person id="1234"> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person id="5678"> <FirstName>Jane</FirstName> <LastName>Doe</LastName> </Person> </People> You can also add comments to your XML documents. To do so,simply precede the comment with the <?xml version="1.0" encoding="UTF-8"?> <!-- A list of current clients --> <People> <Person id="1234"> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person id="5678"> <FirstName>Jane</FirstName> <LastName>Doe</LastName> </Person> </People> As you can see,I’ve added the comment Another consideration when working with XML is that certain characters cannot be parsed when they appear in element values. For example,you cannot include an ampersand (&) in an element’s value,as I’ve done in the <?xml version="1.0" encoding="UTF-8"?> <!-- A list of current clients --> <People> <Person id="1234"> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person id="5678"> <FirstName>Jane</FirstName> <LastName>Doe</LastName> <FavoriteBook>Crime & Punishment</FavoriteBook> </Person> </People> If I try to assign this <?xml version="1.0" encoding="UTF-8"?> <!-- A list of current clients --> <People> <Person id="1234"> <FirstName>John</FirstName> <LastName>Doe</LastName> </Person> <Person id="5678"> <FirstName>Jane</FirstName> <LastName>Doe</LastName> <FavoriteBook>Crime & Punishment</FavoriteBook> </Person> </People> Notice that I’ve replaced the ampersand with the
Another issue that the example raises is the fact that the child elements do not have to be the same from one parent instance to the next. As you can see,the first instance of the SummaryIn this Level,we’ve looked at the primary components that make up an XML document. Elements serve as the basic building blocks for all XML documents,with each element being delineated by an opening tag and a closing tag and the data value itself being enclosed between those two tags. Elements can be embedded within each other,but one element—the root—must act as the parent to all other elements in an XML document. An element can also include attributes,which are defined as part of an element’s opening tag. As handy as it might be to know how to put together an XML document,the purpose of this Level has not been to train you in how to create these types of documents,but rather to provide an introduction to XML so you can more effectively work with XML in SQL Server. In the next Level,we’ll look at how the XML data type is implemented in SQL Server and how it can be assigned to columns and variables in order to store both XML documents and XML fragments. (编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |