POSTGRESQL HOT (HEAP-ONLY TUPLES)

http://archives.postgresql.org/pgsql-patches/2007-09/msg00261.php

PostgreSQL implements HOT (Heap-Only Tuples), a way for the server to limit the work it has to do when updating tuples. That's what we call an optimization :)

PostgreSQL's MVCC implementation means that updating a tuple creates an entire new version of it and marks the old one as no longer valid (as of the updating transaction ID). VACUUM will then have to clean out the old version once no transaction can see it any more.

Let's not forget that the indexes pointing to the old tuple need to point to the new version of it as of that transaction ID. PostgreSQL currently does not store visibility information in the index, though, which reduces the janitorial work here. But still, for the index, updating a tuple is equivalent to a delete and an insert. That's before HOT.

Starting with PostgreSQL 8.3, when a tuple is updated and the update only concerns non-indexed columns, the RDBMS is smart enough that the existing indexes do not need any update at all.

This is done by creating the new tuple, if possible, on the same page as the old one, and maintaining a chain of updated tuples linking each old version to its successor. A HOT tuple is in fact one that can't be reached directly from any index. VACUUM now only has to prune the tuple versions in the chain that are no longer visible, and as no index was updated (there was no need to), there's no VACUUM work to do on the indexes.
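To make the mechanism concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not how the server is implemented: the Table and Version classes, the PAGE_CAPACITY limit and the single indexed column are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

PAGE_CAPACITY = 3  # hypothetical: max tuple versions a page can hold in this toy

Tid = Tuple[int, int]  # (page number, slot number within the page)


@dataclass
class Version:
    row: dict
    next_tid: Optional[Tid] = None  # plays the role of ctid: link to the newer version
    heap_only: bool = False         # analogous to the HEAP_ONLY_TUPLE flag
    hot_updated: bool = False       # analogous to the HEAP_HOT_UPDATED flag


class Table:
    def __init__(self, indexed_col: str):
        self.indexed_col = indexed_col
        self.pages = [[]]   # each page is a list of Version objects
        self.index = []     # list of (key, tid) index entries

    def _place(self, version: Version, page_no: Optional[int] = None) -> Tid:
        if page_no is None:
            # Pick the first page with free space, or start a new page.
            for n, page in enumerate(self.pages):
                if len(page) < PAGE_CAPACITY:
                    page_no = n
                    break
            else:
                self.pages.append([])
                page_no = len(self.pages) - 1
        self.pages[page_no].append(version)
        return (page_no, len(self.pages[page_no]) - 1)

    def insert(self, row: dict) -> Tid:
        tid = self._place(Version(dict(row)))
        self.index.append((row[self.indexed_col], tid))
        return tid

    def update(self, tid: Tid, changes: dict):
        page_no, slot = tid
        old = self.pages[page_no][slot]
        new_row = {**old.row, **changes}
        key_unchanged = new_row[self.indexed_col] == old.row[self.indexed_col]
        room_on_page = len(self.pages[page_no]) < PAGE_CAPACITY
        if key_unchanged and room_on_page:
            # HOT update: same page, no new index entry.
            new_tid = self._place(Version(new_row, heap_only=True), page_no)
            old.hot_updated = True
            kind = "hot"
        else:
            # Cold update: the new version gets its own index entry.
            new_tid = self._place(Version(new_row))
            self.index.append((new_row[self.indexed_col], new_tid))
            kind = "cold"
        old.next_tid = new_tid
        return kind, new_tid
```

In this sketch, updating a non-indexed column leaves the index list untouched as long as the page has room; once the page is full, or the indexed column changes, the update falls back to the cold path and adds an index entry.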

Of course, for HOT to work properly, PostgreSQL now has to follow each HOT chain when SELECTing tuples through an index, but the same number of tuple versions had to be read before HOT too. The difference is that with HOT the new versions of HOT-updated tuples are no longer reachable directly via the index, so PostgreSQL has to follow the chain when reading the heap.
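The chain walk on the read side can be sketched as follows. This is a deliberately simplified illustration: the Version class, the fetch_via_index function and the single-integer snapshot are hypothetical, and the visibility test assumes every transaction up to the snapshot ID has committed, which is far cruder than the real visibility rules.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    value: str
    xmin: int                        # transaction that created this version
    xmax: Optional[int] = None       # transaction that replaced it, if any
    next_slot: Optional[int] = None  # link to the newer version on the same page


def visible(v: Version, snapshot_xid: int) -> bool:
    # Toy rule: created at or before the snapshot, and not yet replaced as of it.
    return v.xmin <= snapshot_xid and (v.xmax is None or v.xmax > snapshot_xid)


def fetch_via_index(page: List[Version], root_slot: int,
                    snapshot_xid: int) -> Optional[Version]:
    """Start at the root tuple the index points to and follow the chain
    until a version visible to the snapshot is found."""
    slot = root_slot
    while slot is not None:
        v = page[slot]
        if visible(v, snapshot_xid):
            return v
        slot = v.next_slot
    return None


# One root tuple followed by two heap-only versions; the index only knows slot 0.
page = [
    Version("a", xmin=1, xmax=5, next_slot=1),
    Version("b", xmin=5, xmax=9, next_slot=2),
    Version("c", xmin=9),
]
```

With this page, a reader whose snapshot is 3 gets the root version, while a reader at snapshot 10 enters at the same root slot and walks two links to reach the newest version.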

Glossary

This comes unedited from the developer documentation of HOT...

Broken HOT chain

A HOT chain in which the key value for an index has changed.

This is not allowed to occur normally but if a new index is created it can happen. In that case various strategies are used to ensure that no transaction for which the older tuples are visible can use the index.

Cold update

A normal, non-HOT update, in which index entries are made for the new version of the tuple.

Dead line pointer

A stub line pointer, that does not point to anything, but cannot be removed or reused yet because there are index pointers to it. Semantically same as a dead tuple. It has state LP_DEAD.

Heap-only tuple

A heap tuple with no index pointers, which can only be reached from indexes indirectly through its ancestral root tuple. Marked with HEAP_ONLY_TUPLE flag.

HOT-safe

A proposed tuple update is said to be HOT-safe if it changes none of the tuple's indexed columns. It will only become an actual HOT update if we can find room on the same page for the new tuple version.

HOT update

An UPDATE where the new tuple becomes a heap-only tuple, and no new index entries are made.

HOT-updated tuple

An updated tuple, for which the next tuple in the chain is a heap-only tuple. Marked with HEAP_HOT_UPDATED flag.

Indexed column

A column used in an index definition. The column might not actually be stored in the index --- it could be used in a functional index's expression, or used in a partial index predicate. HOT treats all these cases alike.
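Taken together with the HOT-safe entry above, this definition can be sketched as a small check. The dictionary layout describing an index (key_columns, expression_columns, predicate_columns) and the byte counts are hypothetical, chosen only for illustration; they are not a real catalog format.

```python
def indexed_columns(indexes):
    """Collect every column used anywhere in an index definition:
    stored keys, expression inputs and partial-index predicate inputs."""
    cols = set()
    for idx in indexes:
        cols |= set(idx.get("key_columns", ()))
        cols |= set(idx.get("expression_columns", ()))
        cols |= set(idx.get("predicate_columns", ()))
    return cols


def is_hot_safe(old_row, new_row, indexes):
    # HOT-safe: none of the tuple's indexed columns change.
    return all(old_row[c] == new_row[c] for c in indexed_columns(indexes))


def will_be_hot(old_row, new_row, indexes, free_bytes, new_tuple_bytes):
    # A HOT-safe update only becomes a HOT update if the new version
    # fits on the same page as the old one.
    return is_hot_safe(old_row, new_row, indexes) and new_tuple_bytes <= free_bytes


# Hypothetical index metadata: a plain index on id, plus an index on email
# whose expression reads name and whose predicate reads active.
indexes = [
    {"key_columns": ["id"]},
    {"key_columns": ["email"],
     "expression_columns": ["name"],
     "predicate_columns": ["active"]},
]
old = {"id": 1, "email": "a@example.com", "name": "A", "active": True, "visits": 1}
```

Here changing visits is HOT-safe, while changing active is not, even though active is never stored in any index.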

Redirecting line pointer

A line pointer that points to another line pointer and has no associated tuple. It has the special lp_flags state LP_REDIRECT, and lp_off is the OffsetNumber of the line pointer it links to. This is used when a root tuple becomes dead but we cannot prune the line pointer because there are non-dead heap-only tuples further down the chain.
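A rough sketch of how pruning could turn a dead root into a redirecting line pointer is given below. The LinePointer class, the state strings and the prune_chain function are hypothetical, offsets are 0-based for convenience, and the sketch assumes dead versions always form a prefix of the chain.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinePointer:
    state: str                      # "normal", "redirect", "dead" or "unused"
    target: Optional[int] = None    # "redirect" only: offset of the line pointer it links to
    dead_tuple: bool = False        # "normal" only: tuple is dead to every transaction
    next_off: Optional[int] = None  # "normal" only: next member of the chain


def prune_chain(lps: List[LinePointer], root_off: int) -> None:
    """Prune the leading dead versions of the HOT chain rooted at root_off."""
    # Walk the chain, skipping over an existing redirect at the root.
    off = lps[root_off].target if lps[root_off].state == "redirect" else root_off
    chain = []
    while off is not None:
        chain.append(off)
        off = lps[off].next_off

    # Count the dead versions at the start of the chain.
    n_dead = 0
    for off in chain:
        if not lps[off].dead_tuple:
            break
        n_dead += 1
    if n_dead == 0:
        return

    # Dead heap-only members have no index pointers, so their slots are freed.
    for off in chain[:n_dead]:
        if off != root_off:
            lps[off] = LinePointer("unused")

    survivors = chain[n_dead:]
    if survivors:
        # Indexes still point at the root: redirect it to the first live member.
        lps[root_off] = LinePointer("redirect", target=survivors[0])
    else:
        # Whole chain is dead: keep a stub until the index entries are removed.
        lps[root_off] = LinePointer("dead")
```

The root slot is never freed in this sketch because index entries may still reference it; only the heap-only members, which no index can point to, are released immediately.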

Root tuple

The first tuple in a HOT update chain; the one that indexes point to.

Update chain

A chain of updated tuples, in which each tuple's ctid points to the next tuple in the chain. A HOT update chain is an update chain (or portion of an update chain) that consists of a root tuple and one or more heap-only tuples. A complete update chain can contain both HOT and non-HOT (cold) updated tuples.
