查询在客户端应用程序中需要很长时间,但在SQL Server Management
我正在开发一个存储图像和相关元数据的应用程序.我在使用NHibernate执行某个查询时遇到了问题.查询耗时过长(在我的机器上类似于31秒),尽管在SQL Server Management Studio中执行时相同的查询只需要几分之一秒.
我已经将问题简化并解决了一个小的测试应用程序: 实体: 标记,由Id组成(字符串,标记值本身) public class Tag { public virtual string Id { get; set; } } 图像,包括Id(int),Name(字符串)和标签(多对多,Tag实例集) public class Image { private Iesi.Collections.Generic.ISet<Tag> tags = new HashedSet<Tag>(); public virtual int Id { get; set; } public virtual string Name { get; set; } public virtual IEnumerable<Tag> Tags { get { return tags; } } public virtual void AddTag(Tag tag) { tags.Add(tag); } } 我使用“按代码映射”使用以下映射: public class TagMapping : ClassMapping<Tag> { public TagMapping() { Id(x => x.Id,map => map.Generator(Generators.Assigned)); } } public class ImageMapping : ClassMapping<Image> { public ImageMapping() { Id(x => x.Id,map => map.Generator(Generators.Native)); Property(x => x.Name); Set(x => x.Tags,map => map.Access(Accessor.Field),map => map.ManyToMany(m2m => { })); } } NHibernate /数据库配置如下所示: <hibernate-configuration xmlns="urn:nhibernate-configuration-2.2"> <session-factory> <property name="dialect">NHibernate.Dialect.MsSql2008Dialect</property> <property name="connection.connection_string_name">PrimaryDatabase</property> <property name="format_sql">true</property> </session-factory> </hibernate-configuration> <connectionStrings> <add name="PrimaryDatabase" providerName="System.Data.SqlClient" connectionString="Data Source=.SQLEXPRESS;Initial Catalog=PerfTest;Integrated Security=True" /> </connectionStrings> 我想实现以下查询:给我所有图像,其中名称包含特定字符串或任何标记包含特定字符串.为了找到后者,我使用了一个子查询,它给了我带有匹配标签的所有图像的ID.因此,最终搜索条件是:图像具有包含特定字符串的名称,或者其ID是子查询返回的ID之一. 这是执行查询的代码: var term = "abc"; var mode = MatchMode.Anywhere; var imagesWithMatchingTag = QueryOver.Of<Image>() .JoinQueryOver<Tag>(x => x.Tags) .WhereRestrictionOn(x => x.Id).IsLike(term,mode) .Select(x => x.Id); var qry = session.QueryOver<Image>() .Where( Restrictions.On<Image>(x => x.Name).IsLike(term,mode) || Subqueries.WhereProperty<Image>(x => x.Id).In(imagesWithMatchingTag)) .List(); 我运行此查询的测试数据库(DBMS:SQL Server 2008 Express R2)是专门为此测试创建的,不包含任何其他内容.我用随机数据填充:10.000张图像(表格图像),4.000张标签(表格标签)和图像与标签之间大约200,000个关联(表格标签),即.每个图像有大约20个相关标签.数据库 SQL NHibernate声称使用的是: SELECT this_.Id as Id1_0_,this_.Name as Name1_0_ FROM Image this_ WHERE ( this_.Name like @p0 or this_.Id in ( SELECT this_0_.Id as y0_ FROM Image this_0_ inner join Tags tags3_ on this_0_.Id=tags3_.image_key inner join Tag tag1_ on tags3_.elt=tag1_.Id WHERE tag1_.Id like @p1 ) ); @p0 = '%abc%' [Type: String (4000)],@p1 = '%abc%' [Type: String (4000)] 鉴于我正在创建的查询,这看起来很合理. 如果我使用NHibernate运行此查询,则查询大约需要30秒(NHibernate.AdoNet.AbstractBatcher – ExecuteReader需要32964毫秒)并返回98个实体. 但是,如果我直接在Sql Server Management studio中执行等效查询: DECLARE @p0 nvarchar(4000) DECLARE @p1 nvarchar(4000) SET @p0 = '%abc%' SET @p1 = '%abc%' SELECT this_.Id as Id1_0_,this_.Name as Name1_0_ FROM Image this_ WHERE ( this_.Name like @p0 or this_.Id in ( SELECT this_0_.Id as y0_ FROM Image this_0_ inner join Tags tags3_ on this_0_.Id=tags3_.image_key inner join Tag tag1_ on tags3_.elt=tag1_.Id WHERE tag1_.Id like @p1 ) ); 查询花费不到一秒钟(并返回98个结果). 进一步实验: 如果我只按名称搜索或仅按标签搜索,即: var qry = session.QueryOver<Image>() .Where( Subqueries.WhereProperty<Image>(x => x.Id).In(imagesWithMatchingTag)) .List(); 要么 var qry = session.QueryOver<Image>() .Where(Restrictions.On<Image>(x => x.Name).IsLike(term,mode)) .List(); 查询很快. 如果我在子查询中不使用like但完全匹配: var imagesWithMatchingTag = QueryOver.Of<Image>() .JoinQueryOver<Tag>(x => x.Tags) .Where(x => x.Id == term) .Select(x => x.Id); 查询也很快. 将名称的匹配模式更改为Exact不会更改任何内容. 当我调试程序并在查询执行时暂停时,托管调用堆栈的顶部如下所示: [Managed to Native Transition] System.Data.dll!SNINativeMethodWrapper.SNIReadSync(System.Runtime.InteropServices.SafeHandle pConn,ref System.IntPtr packet,int timeout) + 0x53 bytes System.Data.dll!System.Data.SqlClient.TdsParserStateObject.ReadSni(System.Data.Common.DbAsyncResult asyncResult,System.Data.SqlClient.TdsParserStateObject stateObj) + 0xa3 bytes System.Data.dll!System.Data.SqlClient.TdsParserStateObject.ReadNetworkPacket() + 0x24 bytes System.Data.dll!System.Data.SqlClient.TdsParserStateObject.ReadBuffer() + 0x1f bytes System.Data.dll!System.Data.SqlClient.TdsParserStateObject.ReadByte() + 0x46 bytes System.Data.dll!System.Data.SqlClient.TdsParser.Run(System.Data.SqlClient.RunBehavior runBehavior,System.Data.SqlClient.SqlCommand cmdHandler,System.Data.SqlClient.SqlDataReader dataStream,System.Data.SqlClient.BulkCopySimpleResultSet bulkCopyHandler,System.Data.SqlClient.TdsParserStateObject stateObj) + 0x67 bytes System.Data.dll!System.Data.SqlClient.SqlDataReader.ConsumeMetaData() + 0x22 bytes System.Data.dll!System.Data.SqlClient.SqlDataReader.MetaData.get() + 0x57 bytes System.Data.dll!System.Data.SqlClient.SqlCommand.FinishExecuteReader(System.Data.SqlClient.SqlDataReader ds,System.Data.SqlClient.RunBehavior runBehavior,string resetOptionsString) + 0xe1 bytes ... 所以,我的问题是: >为什么NHibernate执行查询的时间要长得多,即使使用的SQL是相同的? 我知道查询一般不是世界上最有效的东西,但是我在这里引人注目的是使用NHibernate和manualy查询之间的区别.这里确实存在一些奇怪的事情. 很抱歉这篇文章很长,但我希望尽可能多地包含这个问题.非常感谢您的帮助! 更新1: SELECT this_.Id as Id1_0_,this_.Name as Name1_0_ FROM Image this_ WHERE (this_.Name like '%abc%' /* @p0 */ or this_.Id in (SELECT this_0_.Id as y0_ FROM Image this_0_ inner join Tags tags3_ on this_0_.Id = tags3_.image_key inner join Tag tag1_ on tags3_.elt = tag1_.Id WHERE tag1_.Id like '%abc%' /* @p1 */)) 这正是我之前发布的内容(因为这是NHibernate首先写入日志的内容). 这是NHProf的截图 警告是可以理解的,但不解释行为. 更新2 var imagesWithMatchingTag = QueryOver.Of<Image>() .JoinQueryOver<Tag>(x => x.Tags) .WhereRestrictionOn(x => x.Id).IsLike(term,mode) .Select(x => x.Id); var ids = imagesWithMatchingTag.GetExecutableQueryOver(session).List<int>().ToArray(); var qry = session.QueryOver<Image>() .Where( Restrictions.On<Image>(x => x.Name).IsLike(term,mode) || Restrictions.On<Image>(x => x.Id).IsIn(ids)) .List(); 虽然这确实使主查询再次快速,但我宁愿不采用这种方法,因为它不适合现实世界应用程序中的预期用法.有趣的是,这要快得多.我希望子查询方法同样快,因为它不依赖于外部查询. 更新3 var cmdText = @"SELECT this_.Id as Id1_0_,this_.Name as Name1_0_ FROM Image this_ WHERE (this_.Name like @p0 or this_.Id in (SELECT this_0_.Id as y0_ FROM Image this_0_ inner join Tags tags3_ on this_0_.Id = tags3_.image_key inner join Tag tag1_ on tags3_.elt = tag1_.Id WHERE tag1_.Id like @p1 ));"; using (var con = new SqlConnection(ConfigurationManager.ConnectionStrings["PrimaryDatabase"].ConnectionString)) { con.Open(); using (var txn = con.BeginTransaction()) { using (var cmd = new SqlCommand(cmdText,con,txn)) { cmd.CommandTimeout = 120; cmd.Parameters.AddWithValue("p0","%abc%"); cmd.Parameters.AddWithValue("p1","%abc%"); using (var reader = cmd.ExecuteReader()) { while (reader.Read()) { Console.WriteLine("Match"); } } } txn.Commit(); } } 更新4 查询计划(点击放大): 慢查询 快速查询 确切地说,计划有所不同. 更新5 因为Sql Server确实认为子查询是相关的,所以我尝试了不同的东西:我将与名称相关的标准单独移动到子查询: var term = "abc"; var mode = MatchMode.Anywhere; var imagesWithMatchingTag = QueryOver.Of<Image>() .JoinQueryOver<Tag>(x => x.Tags) .WhereRestrictionOn(x => x.Id).IsLike(term,mode) .Select(x => x.Id); var imagesWithMatchingName = QueryOver.Of<Image>() .WhereRestrictionOn(x => x.Name).IsLike(term,mode) .Select(x => x.Id); var qry = session.QueryOver<Image>() .Where( Subqueries.WhereProperty<Image>(x => x.Id).In(imagesWithMatchingName) || Subqueries.WhereProperty<Image>(x => x.Id).In(imagesWithMatchingTag) ).List(); 生成的SQL: SELECT this_.Id as Id1_0_,this_.Name as Name1_0_ FROM Image this_ WHERE ( this_.Id in ( SELECT this_0_.Id as y0_ FROM Image this_0_ inner join Tags tags3_ on this_0_.Id=tags3_.image_key inner join Tag tag1_ on tags3_.elt=tag1_.Id WHERE tag1_.Id like @p0 ) or this_.Id in ( SELECT this_0_.Id as y0_ FROM Image this_0_ WHERE this_0_.Name like @p1 ) ); @p0 = '%abc%' [Type: String (4000)],@p1 = '%abc%' [Type: String (4000)] 这似乎打破了相关性,结果查询再次“快速”(“快速”,如“当下可接受”).查询时间从30秒下降到~170ms.仍然不是轻量级查询,但至少可以让我从这里继续.我知道“喜欢’%foo%’”永远不会超级快.如果遇到最糟糕的情况,我仍然可以转移到专门的搜索服务器(Lucene,solr)或真正的全文搜索. 更新6 var qry = session.QueryOver(() => img) .Left.JoinQueryOver(x => x.Tags,() => tag) .Where( Restrictions.Like(Projections.Property(() => img.Name),term,mode) || Restrictions.Like(Projections.Property(() => tag.Id),mode)) .TransformUsing(Transformers.DistinctRootEntity) .List(); SQL: SELECT this_.Id as Id1_1_,this_.Name as Name1_1_,tags3_.image_key as image1_3_,tag1_.Id as elt3_,tag1_.Id as Id0_0_ FROM Image this_ left outer join Tags tags3_ on this_.Id=tags3_.image_key left outer join Tag tag1_ on tags3_.elt=tag1_.Id WHERE ( this_.Name like @p0 or tag1_.Id like @p1 ); @p0 = '%abc%' [Type: String (4000)],@p1 = '%abc%' [Type: String (4000)] 但是,查询现在比具有子查询的版本稍差.我会进一步研究这个问题. 解决方法我敢打赌,这是第二个慢的查询:var qry = session.QueryOver<Image>() .Where( Restrictions.On<Image>(x => x.Name).IsLike(term,mode) || Subqueries.WhereProperty<Image>(x => x.Id).In(imagesWithMatchingTag)) .List(); 您仅为第一个查询提供了SQL.那第二个怎么样?你在SQL Management Studio下测试了吗?使用SQL Server Profiler作为@JoachimIsaksson建议找出NHibernate执行服务器端的查询. 这看起来就像是将97个图像对象加载到内存中.它们各有多大? 编辑 另一个赌注是您的第一个查询执行第二个查询的广告内部查询.尝试在第一个查询上执行.List()以将标记加载到内存中. 编辑2 从查询计划看,您的查询看起来像是Correlated subquery. var qry = session.QueryOver<Image>() .Where( Subqueries.WhereProperty<Image>(x => x.Id).In(imagesWithMatchingTag)) .List(); 要么 var qry = session.QueryOver<Image>() .Where(Restrictions.On<Image>(x => x.Name).IsLike(term,mode)) .List(); 只需UNION它们就可以获得与单独运行它们相同的结果. 这就是IS IN(查询)的问题 – 您无法确定数据库是如何执行它的(除非您以某种方式强迫它使用某个计划).也许你可以改变.In()它以某种方式变成JoinQueryOver()? (编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |