[Swift] LeetCode 591. Tag Validator

Published: 2020-12-14 05:03:52 | Category: Encyclopedia | Source: compiled from the web

Given a string representing a code snippet, implement a tag validator that parses the code and returns whether it is valid. A code snippet is valid if all of the following rules hold:

1. The code must be wrapped in a valid closed tag. Otherwise, the code is invalid.
2. A closed tag (not necessarily valid) has exactly the following format: <TAG_NAME>TAG_CONTENT</TAG_NAME>. Here, <TAG_NAME> is the start tag and </TAG_NAME> is the end tag. The TAG_NAME in the start and end tags must be the same. A closed tag is valid if and only if both the TAG_NAME and the TAG_CONTENT are valid.
3. A valid TAG_NAME contains only upper-case letters and has a length in the range [1,9]. Otherwise, the TAG_NAME is invalid.
4. A valid TAG_CONTENT may contain other valid closed tags, cdata, and any characters (see the Note below) EXCEPT an unmatched <, unmatched start and end tags, and unmatched or closed tags with an invalid TAG_NAME. Otherwise, the TAG_CONTENT is invalid.
5. A start tag is unmatched if no end tag exists with the same TAG_NAME, and vice versa. You also need to account for unbalanced tags when tags are nested.
6. A < is unmatched if you cannot find a subsequent >. When you find a < or </, all the subsequent characters until the next > should be parsed as the TAG_NAME (not necessarily valid).
7. The cdata has the following format: <![CDATA[CDATA_CONTENT]]>. The range of CDATA_CONTENT is defined as the characters between <![CDATA[ and the first subsequent ]]>.
8. CDATA_CONTENT may contain any characters. The purpose of cdata is to prevent the validator from parsing CDATA_CONTENT, so even if it contains characters that could be parsed as a tag (valid or invalid), you should treat them as regular characters.
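
The TAG_NAME rule (upper-case letters only, length in [1,9]) can be checked on its own. The following is a minimal sketch; the function name `isValidTagName` is a hypothetical helper introduced here for illustration and is not part of the solution further down.

```swift
// Hypothetical helper, not part of the original solution.
// A valid TAG_NAME has 1 to 9 characters, all of them upper-case A-Z.
func isValidTagName(_ name: String) -> Bool {
    let upperCase: ClosedRange<Character> = "A"..."Z"
    guard (1...9).contains(name.count) else { return false }
    return name.allSatisfy { upperCase.contains($0) }
}

print(isValidTagName("DIV"))         // true
print(isValidTagName("div"))         // false: lower-case letters
print(isValidTagName(""))            // false: empty name
print(isValidTagName("1234567890"))  // false: ten characters, and digits
```

Checking the length first keeps the character scan bounded to at most nine characters.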

Valid Code Examples:

Input: "<DIV>This is the first line <![CDATA[<div>]]></DIV>"

Output: True

Explanation: 

The code is wrapped in a closed tag : <DIV> and </DIV>. 

The TAG_NAME is valid,the TAG_CONTENT consists of some characters and cdata. 

Although CDATA_CONTENT has unmatched start tag with invalid TAG_NAME,it should be considered as plain text,not parsed as tag.

So TAG_CONTENT is valid,and then the code is valid. Thus return true.


Input: "<DIV>>>  ![cdata[]] <![CDATA[<div>]>]]>]]>>]</DIV>"

Output: True

Explanation:

We first separate the code into : start_tag|tag_content|end_tag.

start_tag -> "<DIV>"

end_tag -> "</DIV>"

tag_content could also be separated into : text1|cdata|text2.

text1 -> ">>  ![cdata[]] "

cdata -> "<![CDATA[<div>]>]]>",where the CDATA_CONTENT is "<div>]>"

text2 -> "]]>>]"


The reason why start_tag is NOT "<DIV>>>" is rule 6.
The reason why cdata is NOT "<![CDATA[<div>]>]]>]]>" is rule 7.
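
The "first subsequent ]]>" behaviour in the example above can be sketched as follows. The function name `cdataContent` is a hypothetical helper used only for illustration, and the sketch assumes Foundation is available for `range(of:)`.

```swift
import Foundation

// Hypothetical helper, not part of the original solution.
// Given text that starts with "<![CDATA[", returns the CDATA_CONTENT, which
// ends at the FIRST subsequent "]]>", or nil if the cdata is not terminated.
func cdataContent(_ text: String) -> String? {
    let opener = "<![CDATA["
    guard text.hasPrefix(opener) else { return nil }
    let rest = text.dropFirst(opener.count)
    guard let close = rest.range(of: "]]>") else { return nil }
    return String(rest[..<close.lowerBound])
}

// The tail of the second valid example, starting at its cdata section.
print(cdataContent("<![CDATA[<div>]>]]>]]>>]</DIV>") ?? "nil")  // <div>]>
```

Because the search stops at the first "]]>", the trailing "]]>>]" is left over as ordinary text, which matches text2 in the explanation.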

Invalid Code Examples:

Input: "<A>  <B> </A>   </B>"
Output: False
Explanation: Unbalanced. If "<A>" is closed,then "<B>" must be unmatched,and vice versa.

Input: "<DIV>  div tag is not closed  <DIV>"
Output: False

Input: "<DIV>  unmatched <  </DIV>"
Output: False

Input: "<DIV> closed tags with invalid tag name  <b>123</b> </DIV>"
Output: False

Input: "<DIV> unmatched tags with invalid tag name  </1234567890> and <CDATA[[]]>  </DIV>"
Output: False

Input: "<DIV>  unmatched start tag <B>  and unmatched end tag </C>  </DIV>"
Output: False
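
The unbalanced cases above come down to a stack discipline: every end tag must match the most recently opened start tag. Below is a minimal sketch of that check alone, assuming the tag names have already been extracted. The function name `isBalanced` and the convention of marking end tags with a leading "/" are assumptions made for illustration.

```swift
// Hypothetical helper, not part of the original solution.
// Takes tag names in order of appearance; an end tag is written with a
// leading "/". Returns true if every tag is closed in properly nested order.
func isBalanced(_ tags: [String]) -> Bool {
    var open = [String]()  // names of the tags that are currently open
    for tag in tags {
        if tag.hasPrefix("/") {
            // An end tag must match the most recently opened tag.
            guard open.popLast() == String(tag.dropFirst()) else { return false }
        } else {
            open.append(tag)
        }
    }
    return open.isEmpty
}

print(isBalanced(["A", "B", "/B", "/A"]))  // true
print(isBalanced(["A", "B", "/A", "/B"]))  // false: "<A>  <B> </A>   </B>"
print(isBalanced(["DIV", "DIV"]))          // false: neither tag is closed
```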

Note:

For simplicity, you may assume that the input code (including the "any characters" mentioned above) contains only letters, digits, '<', '>', '/', '!', '[', ']' and ' '.

Runtime: 20 ms

Memory Usage: 19.8 MB

class Solution {
    func isValid(_ code: String) -> Bool {
        // Copy into an array so characters can be indexed by Int in O(1).
        let chars = Array(code)
        guard !chars.isEmpty else { return false }
        var stack = [String]()  // names of the tags that are currently open
        var i = 0
        while i < chars.count {
            // Past the first character, an empty stack means the outermost
            // tag has already closed, so the rest lies outside of it.
            if i > 0 && stack.isEmpty { return false }
            if chars.hasPrefix("<![CDATA[", at: i) {
                // cdata may only appear inside an open tag; skip it unparsed.
                guard !stack.isEmpty,
                      let end = chars.find("]]>", from: i + 9) else { return false }
                i = end + 3
            } else if chars.hasPrefix("</", at: i) {
                // End tag: its name must equal the innermost open tag.
                let j = i + 2
                guard let end = chars.find(">", from: j),
                      stack.last == String(chars[j..<end]) else { return false }
                stack.removeLast()
                i = end + 1
            } else if chars[i] == "<" {
                // Start tag: the name must be 1 to 9 upper-case letters.
                let j = i + 1
                guard let end = chars.find(">", from: j),
                      (1...9).contains(end - j),
                      chars[j..<end].allSatisfy({ ("A"..."Z").contains($0) })
                else { return false }
                stack.append(String(chars[j..<end]))
                i = end + 1
            } else {
                // Plain text is valid only inside an open tag.
                guard !stack.isEmpty else { return false }
                i += 1
            }
        }
        return stack.isEmpty
    }
}

// Helpers for scanning an array of characters by integer offset.
extension Array where Element == Character {
    // Returns true if `pattern` occurs at offset `start`.
    func hasPrefix(_ pattern: String, at start: Int) -> Bool {
        let p = Array(pattern)
        guard start >= 0, start + p.count <= count else { return false }
        return self[start..<start + p.count].elementsEqual(p)
    }

    // Returns the offset of the first occurrence of `pattern` at or after
    // `start`, or nil if there is none.
    func find(_ pattern: String, from start: Int) -> Int? {
        let p = Array(pattern)
        guard start >= 0, !p.isEmpty, start + p.count <= count else { return nil }
        for i in start...(count - p.count) where self[i..<i + p.count].elementsEqual(p) {
            return i
        }
        return nil
    }
}

(Editor: 李大同)
