Efficient Use of PostgreSQL Indexes
转载网址:https://devcenter.heroku.com/articles/postgresql-indexes Table of Contents
There are many types of indexes in Postgres,as well as different ways to use them. In this article we give an overview of the types of indexes available,and explain different ways of using and maintaining the most common index type: B-Trees. An index is a way to efficiently retrieve a relatively small number of rows from a table. It is only useful if the number of rows to be retrieved from a table is relatively small (i.e. the condition for retrieving rows - the WHERE clause - is selective). B-Tree indexes are also useful for avoiding sorting. Index TypesPostgres supports many different index types:
This article is about how to get the most out of default B-Tree indexes. For examples of GIN and GiST index usage,refer to thecontrib packages. Why is my query not using an index?There are many reasons why the Postgres planner may choose to not use an index. Most of the time,the planner chooses correctly,even if it isn’t obvious why. It’s okay if the same query uses an index scan on some occasions but not others. The number of rows retrieved from the table may vary based on the particular constant values the query retrieves. So,for example,it might be correct for the query planner to use an index for the query Partial IndexesA partial index covers just a subset of a table’s data. It is an index with a WHERE clause. The idea is to increase the efficiency of the index by reducing its size. A smaller index takes less storage,is easier to maintain,and is faster to scan. For example,suppose you allow users to flag comments on your site,which in turn sets the CREATEINDEXarticles_flagged_created_at_indexONarticles(created_at)WHEREflaggedISTRUE; This index will remain fairly small,and can also be used along other indexes on the more complex queries that may require it. Expression IndexesExpression indexes are useful for queries that match on some function or modification of your data. Postgres allows you to index the result of that function so that searches become as efficient as searching by raw data values. For example,you may require users to store their email addresses for signing in,but you want case insensitive authentication. In that case it’s possible to store the email address as is,but do searches on CREATEINDEXusers_lower_emailONusers(lower(email)); Another common example is for finding rows for a given date,where we’ve stored timestamps in a datetime field but want to find them by a date casted value. An index like Unique IndexesA unique index guarantees that the table won’t have more than one row with the same value. It’s advantageous to create unique indexes for two reasons: data integrity and performance. Lookups on a unique index are generally very fast. In terms of data integrity,using a There is little distinction between unique indexes and unique constraints. Unique indexes can be though of as lower level,since expression indexes and partial indexes cannot be created as unique constraints. Even partial unique indexes on expressions are possible. Multi-column IndexesWhile Postgres has the ability to create multi-column indexes,it’s important to understand when it makes sense to do so. The Postgres query planner has the ability to combine and use multiple single-column indexes in a multi-column query by performing a bitmap index scan. In general,you can create an index on every column that covers query conditions and in most cases Postgres will use them,so make sure to benchmark and justify the creation of a multi-column index before you create them. As always,indexes come with a cost,and multi-column indexes can only optimize the queries that reference the columns in the index in the same order,while multiple single column indexes provide performance improvements to a larger number of queries. However there are cases where a multi-column index clearly makes sense. An index on columns B-Trees and sortingB-Tree index entries are sorted in ascending order by default. In some cases it makes sense to supply a different sort order for an index. Take the case when you’re showing a paginated list of articles,sorted by most recent published first. We may have a In this case we can create an index like so: CREATEINDEXarticles_published_at_indexONarticles(published_atDESCNULLSLAST); In Postgres 9.2 and above,it’s of note that indexes are not always required to go to the table,provided we can get everything needed from the index (i.e. no unindexed columns are of interest). This feature is called “Index-only scans”. Since we will be querying the table in sorted order by This technique is mostly relevant with single column indexes when you require “nulls to sort last” behavior,because otherwise the order is already available since an index can be scanned in any direction. It becomes even more relevant when used against a multi-column index when a query requests a mixed sort order,like Managing and Maintaining indexesIndexes in Postgres do not hold all row data. Even when an index is used in a query and matching rows where found,Postgres will go to disk to fetch the row data. Additionally,row visibility information (discussed inthe MVCC article) is not stored on the index either,therefore Postgres must also go to disk to fetch that information. Having that in mind,you can see how in some cases using an index doesn’t really make sense. An index must be selective enough to reduce the number of disk lookups for it to be worth it. For example,a primary key lookup with a big enough table makes good use of an index: instead of sequentially scanning the table matching the query conditions,Postgres is able to find the targeted rows in an index,and then fetch them from disk selectively. For very small tables,for example a cities lookup table,an index may be undesirable,even if you search by city name. In that case,Postgres may decide to ignore the index in favor of a sequential scan. Postgres will decide to perform a sequential scan on any query that will hit a significant portion of a table. If you do have an index on that column,it will be a dead index that’s never used - and indexes are not free: they come at a cost in terms of storage and maintenance. For more on running production,staging,and other environments for your Heroku application,take a look at ourManaging Multiple Environmentsarticle. When tuning a query and understanding what indexes make the most sense,never try to it on your development machine. Whether an index is used or not depends on a number of factors,including the Postgres server configuration,the data in the table,the index and the query. For instance,trying to make a query use an index on your development machine with a small subset of “test data” will be frustrating: Postgres will determine that the dataset is so small that it’s not worth the overhead of reading through the index and then fetching the data from disk. Random I/O is much slower than sequential,so the cost of a sequential scan is lower than that of the random I/O introduced by reading the index and selectively finding the data on disk. Performing index tuning should be done on production,or on a staging environment that is as close to production as possible. On the Heroku Postgres database platform it is possible to copy your production database to a different environmentquite easily. When you are ready to apply an index on your production database,keep in mind that creating an index locks the table against writes. For big tables that can mean your site is down for hours. Fortunately Postgres allows you to Finally,indexes will become fragmented and unoptimized after some time,especially if the rows in the table are often updated or deleted. In those cases it may be required to perform a Postgres provides a lot of flexibility when it comes to creating B-tree indexes that are optimized to your specific use cases,as well as options for managing the ever-growing database behind your applications. These tips should help you keep your database healthy,and your queries snappy. (编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |