计算字符串的相似度(VB2005)——思索之四
参阅本文章之前,请先参阅以下文章 计算字符串的相似度(VB2005)——思索之一 计算字符串的相似度(VB2005)——思索之二 计算字符串的相似度(VB2005)——思索之三
在思索之三的程序完成之后,经测试,结果和速度都令人满意,稍显美中不足的是就是空间复杂度还是比较高,为O(S1×S2),当S1和S2都比较大的时候,可能会占用非常多的空间。 如何解决这个问题呢? 经过对计算过程的分析,我发现作为存储的二维矩阵,在每一个循环中,其实只有一行的数据参与了计算,之前的数据行就不再参与计算了。因此,从这个出发点入手,对代码进行了微调,将二维数组改为一维数组。经测试,结果和速度与之前思索之三中的代码没有差异。但空间复杂度少了很多,为O(S1)。 有意思的是,再对代码进行测试时,无意中发现了之前的代码中存在一个不起眼的逻辑错误。请参阅者自行考量。 现将代码赋予其后,用的是VB2005
Public Class clsDistance Private mCharA() As Char Private mCharB() As Char Private mCharALen As Integer Private mCharBLen As Integer
Public Sub New(ByVal StrA As String,ByVal StrB As String) mCharA = StrA.ToCharArray mCharB = StrB.ToCharArray mCharALen = mCharA.Length mCharBLen = mCharB.Length End Sub Public Function CacuDistance() As Integer Dim i As Integer If mCharALen = 0 Then Return mCharBLen If mCharBLen = 0 Then Return mCharALen Dim j As Integer = Min(mCharALen,mCharBLen) - 1 Dim tP1 As Integer,tP2 As Integer tP1 = -1 tP2 = -1 For i = 0 To j If mCharA(i) <> mCharB(i) Then tP1 = i Exit For End If Next If tP1 = -1 Then Return Math.Abs(mCharALen - mCharBLen) For i = 0 To j - tP1 If mCharA(mCharALen - i - 1) <> mCharB(mCharBLen - i - 1) Then tP2 = i If tP2 = -1 Then Return Math.Abs(mCharALen - mCharBLen) Dim tA(mCharALen - tP1 - tP2) As Integer For i = 0 To tA.GetUpperBound(0) tA(i) = i Dim tN1 As Integer,tN2 As Integer,tN3 As Integer For i = 0 To mCharBLen - tP1 - tP2 - 1 tN1 = tA(0) tN2 = tN1 + 1 For j = 1 To tA.GetUpperBound(0) If mCharA(mCharALen - tP2 - j) = mCharB(mCharBLen - tP2 - i - 1) Then tN3 = tN1 Else tN3 = Min(tA(j),tN1,tN2) + 1 tA(j - 1) = tN2 tN2 = tN3 tN1 = tA(j) tA(tA.GetUpperBound(0)) = tN2 Return tA(tA.GetUpperBound(0)) End Function Public Function Min(ByVal ParamArray Num() As Integer) As Integer Dim tN As Integer,i As Integer If Num.Length = 0 Then Return Nothing tN = Num(0) For i = 1 To Num.GetUpperBound(0) If Num(i) < tN Then tN = Num(i) Return tN End Function End Class (编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |