ruby-on-rails – 当没有设置行或表锁时,如何发生死锁? (由rails ActiveRecord#touch引起)

我正在广泛使用Rails 4的模板缓存功能.很多嵌套模板和触摸:很多模型都是如此.总的来说,它已被证明是一个易于推理的综合解决方案.

我最近实现了一个功能,其中创建了多个后台作业,这些作业遍历数百个对象并创建与其相关的其他对象.创建每个对象时,触摸相同的用户.

因此,这两个工作并行运行:

>作业A:每次触摸用户123时,连续创建数百个对象
>作业B:每次触摸用户123时,连续创建数百个对象

在部署此功能后,我发现有时postgres会检测到死锁并取消两个正在等待的查询.错误显示查询始终是触摸查询(使用当前时间更新用户123的updated_at).

pid: 33541 tid: 0 vtid:  LOG:  database system was shut down at 2014-03-27 19:57:11 EDT
pid: 33545 tid: 0 vtid:  LOG:  autovacuum launcher started
pid: 33539 tid: 0 vtid:  LOG:  database system is ready to accept connections
pid: 33626 tid: 0 vtid: 2/5 LOG:  statement: set client_encoding to 'UTF8'
pid: 33626 tid: 0 vtid: 2/6 LOG:  statement: set client_encoding to 'unicode'
pid: 33626 tid: 0 vtid: 2/7 LOG:  statement: SET client_min_messages TO 'warning'
pid: 33626 tid: 0 vtid: 2/8 LOG:  statement: SHOW client_min_messages
pid: 33626 tid: 0 vtid: 2/9 LOG:  statement: SET client_min_messages TO 'panic'
pid: 33626 tid: 0 vtid: 2/10 LOG:  statement: SET standard_conforming_strings = on
pid: 33626 tid: 0 vtid: 2/11 LOG:  statement: SET client_min_messages TO 'warning'
pid: 33626 tid: 0 vtid: 2/12 LOG:  statement: SET time zone 'UTC'
pid: 33626 tid: 0 vtid: 2/13 LOG:  statement: SELECT oid,typname,typelem,typdelim,typinput FROM pg_type
pid: 33626 tid: 0 vtid: 2/14 LOG:  statement: SHOW TIME ZONE
pid: 33626 tid: 0 vtid: 2/15 LOG:  statement: SELECT 1
pid: 33626 tid: 0 vtid: 2/16 LOG:  statement:               SELECT COUNT(*)
                  FROM pg_class c
                  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
                  WHERE c.relkind in ('v','r')
                  AND c.relname = 'users'
                  AND n.nspname = ANY (current_schemas(false))

pid: 33626 tid: 0 vtid: 2/17 LOG:  statement:             SELECT attr.attname
                FROM pg_attribute attr
                INNER JOIN pg_constraint cons ON attr.attrelid = cons.conrelid AND attr.attnum = cons.conkey[1]
                WHERE cons.contype = 'p'
                  AND cons.conrelid = '"users"'::regclass

pid: 33626 tid: 0 vtid: 2/18 LOG:  statement: SELECT  "users".* FROM "users"   ORDER BY "users"."id" ASC LIMIT 1
pid: 33626 tid: 0 vtid: 2/19 LOG:  statement:               SELECT a.attname,format_type(a.atttypid,a.atttypmod),pg_get_expr(d.adbin,d.adrelid),a.attnotnull,a.atttypid,a.atttypmod
                    FROM pg_attribute a LEFT JOIN pg_attrdef d
                      ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                   WHERE a.attrelid = '"users"'::regclass
                     AND a.attnum > 0 AND NOT a.attisdropped
                   ORDER BY a.attnum

pid: 33627 tid: 0 vtid: 3/2 LOG:  statement: set client_encoding to 'UTF8'
pid: 33627 tid: 0 vtid: 3/3 LOG:  statement: set client_encoding to 'unicode'
pid: 33627 tid: 0 vtid: 3/4 LOG:  statement: SET client_min_messages TO 'warning'
pid: 33627 tid: 0 vtid: 3/5 LOG:  statement: SHOW client_min_messages
pid: 33627 tid: 0 vtid: 3/6 LOG:  statement: SET client_min_messages TO 'panic'
pid: 33627 tid: 0 vtid: 3/7 LOG:  statement: SET standard_conforming_strings = on
pid: 33627 tid: 0 vtid: 3/8 LOG:  statement: SET client_min_messages TO 'warning'
pid: 33627 tid: 0 vtid: 3/9 LOG:  statement: SET time zone 'UTC'
pid: 33627 tid: 0 vtid: 3/10 LOG:  statement: SELECT oid,typinput FROM pg_type
pid: 33627 tid: 0 vtid: 3/11 LOG:  statement: SHOW TIME ZONE
pid: 33627 tid: 0 vtid: 3/12 LOG:  statement: SELECT 1
pid: 33627 tid: 0 vtid: 3/13 LOG:  statement:               SELECT a.attname,a.atttypmod
                    FROM pg_attribute a LEFT JOIN pg_attrdef d
                      ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                   WHERE a.attrelid = '"users"'::regclass
                     AND a.attnum > 0 AND NOT a.attisdropped
                   ORDER BY a.attnum

pid: 33627 tid: 0 vtid: 3/14 LOG:  statement:               SELECT COUNT(*)
                  FROM pg_class c
                  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
                  WHERE c.relkind in ('v','r')
                  AND c.relname = 'users'
                  AND n.nspname = ANY (current_schemas(false))

pid: 33627 tid: 0 vtid: 3/15 LOG:  statement:             SELECT attr.attname
                FROM pg_attribute attr
                INNER JOIN pg_constraint cons ON attr.attrelid = cons.conrelid AND attr.attnum = cons.conkey[1]
                WHERE cons.contype = 'p'
                  AND cons.conrelid = '"users"'::regclass

pid: 33627 tid: 0 vtid: 3/16 LOG:  statement: SELECT  id FROM "users"  WHERE (bared_at IS NOT NULL)  ORDER BY "users"."id" ASC LIMIT 1000
pid: 33628 tid: 0 vtid: 4/2 LOG:  statement: set client_encoding to 'UTF8'
pid: 33628 tid: 0 vtid: 4/3 LOG:  statement: set client_encoding to 'unicode'
pid: 33628 tid: 0 vtid: 4/4 LOG:  statement: SET client_min_messages TO 'warning'
pid: 33628 tid: 0 vtid: 4/5 LOG:  statement: SHOW client_min_messages
pid: 33628 tid: 0 vtid: 4/6 LOG:  statement: SET client_min_messages TO 'panic'
pid: 33628 tid: 0 vtid: 4/7 LOG:  statement: SET standard_conforming_strings = on
pid: 33628 tid: 0 vtid: 4/8 LOG:  statement: SET client_min_messages TO 'warning'
pid: 33628 tid: 0 vtid: 4/9 LOG:  statement: SET time zone 'UTC'
pid: 33628 tid: 0 vtid: 4/10 LOG:  statement: SELECT oid,typinput FROM pg_type
pid: 33628 tid: 0 vtid: 4/11 LOG:  statement: SHOW TIME ZONE
pid: 33628 tid: 0 vtid: 4/12 LOG:  statement: SELECT 1
pid: 33628 tid: 0 vtid: 4/13 LOG:  statement: SELECT  id FROM "users"  WHERE (bared_at IS NOT NULL)  ORDER BY "users"."id" ASC LIMIT 1000
pid: 33627 tid: 0 vtid: 3/17 LOG:  statement:               SELECT COUNT(*)
                  FROM pg_class c
                  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
                  WHERE c.relkind in ('v','r')
                  AND c.relname = 'monkeys'
                  AND n.nspname = ANY (current_schemas(false))

pid: 33628 tid: 0 vtid: 4/14 LOG:  statement:               SELECT COUNT(*)
                  FROM pg_class c
                  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
                  WHERE c.relkind in ('v','r')
                  AND c.relname = 'monkeys'
                  AND n.nspname = ANY (current_schemas(false))

pid: 33627 tid: 0 vtid: 3/18 LOG:  statement:             SELECT attr.attname
                FROM pg_attribute attr
                INNER JOIN pg_constraint cons ON attr.attrelid = cons.conrelid AND attr.attnum = cons.conkey[1]
                WHERE cons.contype = 'p'
                  AND cons.conrelid = '"monkeys"'::regclass

pid: 33628 tid: 0 vtid: 4/15 LOG:  statement:             SELECT attr.attname
                FROM pg_attribute attr
                INNER JOIN pg_constraint cons ON attr.attrelid = cons.conrelid AND attr.attnum = cons.conkey[1]
                WHERE cons.contype = 'p'
                  AND cons.conrelid = '"monkeys"'::regclass

pid: 33627 tid: 0 vtid: 3/19 LOG:  statement:               SELECT a.attname,a.atttypmod
                    FROM pg_attribute a LEFT JOIN pg_attrdef d
                      ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                   WHERE a.attrelid = '"users_monkeys"'::regclass
                     AND a.attnum > 0 AND NOT a.attisdropped
                   ORDER BY a.attnum

pid: 33628 tid: 0 vtid: 4/16 LOG:  statement:               SELECT a.attname,a.atttypmod
                    FROM pg_attribute a LEFT JOIN pg_attrdef d
                      ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                   WHERE a.attrelid = '"users_monkeys"'::regclass
                     AND a.attnum > 0 AND NOT a.attisdropped
                   ORDER BY a.attnum

pid: 33627 tid: 0 vtid: 3/20 LOG:  statement:               SELECT a.attname,a.atttypmod
                    FROM pg_attribute a LEFT JOIN pg_attrdef d
                      ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                   WHERE a.attrelid = '"monkeys"'::regclass
                     AND a.attnum > 0 AND NOT a.attisdropped
                   ORDER BY a.attnum

pid: 33628 tid: 0 vtid: 4/17 LOG:  statement:               SELECT a.attname,a.atttypmod
                    FROM pg_attribute a LEFT JOIN pg_attrdef d
                      ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                   WHERE a.attrelid = '"monkeys"'::regclass
                     AND a.attnum > 0 AND NOT a.attisdropped
                   ORDER BY a.attnum

pid: 33627 tid: 0 vtid: 3/21 LOG:  statement: SHOW search_path
pid: 33628 tid: 0 vtid: 4/18 LOG:  statement: SHOW search_path
pid: 33627 tid: 0 vtid: 3/23 LOG:  execute a1: SELECT COUNT(*) FROM "monkeys" INNER JOIN "users_monkeys" ON "monkeys"."id" = "users_monkeys"."monkey_id" WHERE "users_monkeys"."user_id" = $1 AND (((monkeys.name = 'recruiter') AND (monkeys.resource_type IS NULL) AND (monkeys.resource_id IS NULL)))
pid: 33627 tid: 0 vtid: 3/23 DETAIL:  parameters: $1 = '2'
pid: 33628 tid: 0 vtid: 4/20 LOG:  execute a1: SELECT COUNT(*) FROM "monkeys" INNER JOIN "users_monkeys" ON "monkeys"."id" = "users_monkeys"."monkey_id" WHERE "users_monkeys"."user_id" = $1 AND (((monkeys.name = 'recruiter') AND (monkeys.resource_type IS NULL) AND (monkeys.resource_id IS NULL)))
pid: 33628 tid: 0 vtid: 4/20 DETAIL:  parameters: $1 = '2'
pid: 33627 tid: 0 vtid: 3/24 LOG:  statement: SELECT  "foos".* FROM "foos"  WHERE "foos"."user_id" = 2 AND "foos"."fooer_id" = 1 LIMIT 1
pid: 33628 tid: 0 vtid: 4/21 LOG:  statement: SELECT  "foos".* FROM "foos"  WHERE "foos"."user_id" = 1 AND "foos"."fooer_id" = 2 LIMIT 1
pid: 33628 tid: 0 vtid: 4/22 LOG:  statement:               SELECT a.attname,a.atttypmod
                    FROM pg_attribute a LEFT JOIN pg_attrdef d
                      ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                   WHERE a.attrelid = '"foos"'::regclass
                     AND a.attnum > 0 AND NOT a.attisdropped
                   ORDER BY a.attnum

pid: 33627 tid: 0 vtid: 3/25 LOG:  statement:               SELECT a.attname,a.atttypmod
                    FROM pg_attribute a LEFT JOIN pg_attrdef d
                      ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                   WHERE a.attrelid = '"foos"'::regclass
                     AND a.attnum > 0 AND NOT a.attisdropped
                   ORDER BY a.attnum

pid: 33628 tid: 0 vtid: 4/23 LOG:  statement:               SELECT COUNT(*)
                  FROM pg_class c
                  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
                  WHERE c.relkind in ('v','r')
                  AND c.relname = 'foos'
                  AND n.nspname = ANY (current_schemas(false))

pid: 33627 tid: 0 vtid: 3/26 LOG:  statement:               SELECT COUNT(*)
                  FROM pg_class c
                  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
                  WHERE c.relkind in ('v','r')
                  AND c.relname = 'foos'
                  AND n.nspname = ANY (current_schemas(false))

pid: 33628 tid: 0 vtid: 4/24 LOG:  statement:             SELECT attr.attname
                FROM pg_attribute attr
                INNER JOIN pg_constraint cons ON attr.attrelid = cons.conrelid AND attr.attnum = cons.conkey[1]
                WHERE cons.contype = 'p'
                  AND cons.conrelid = '"foos"'::regclass

pid: 33627 tid: 0 vtid: 3/27 LOG:  statement:             SELECT attr.attname
                FROM pg_attribute attr
                INNER JOIN pg_constraint cons ON attr.attrelid = cons.conrelid AND attr.attnum = cons.conkey[1]
                WHERE cons.contype = 'p'
                  AND cons.conrelid = '"foos"'::regclass

pid: 33628 tid: 0 vtid: 4/25 LOG:  statement: BEGIN
pid: 33627 tid: 0 vtid: 3/28 LOG:  statement: BEGIN
pid: 33628 tid: 0 vtid: 4/25 LOG:  statement: SELECT  1 AS one FROM "foos"  WHERE ("foos"."user_id" = 1 AND "foos"."id" != 4 AND "foos"."fooer_id" = 2) LIMIT 1
pid: 33627 tid: 0 vtid: 3/28 LOG:  statement: SELECT  1 AS one FROM "foos"  WHERE ("foos"."user_id" = 2 AND "foos"."id" != 3 AND "foos"."fooer_id" = 1) LIMIT 1
pid: 33628 tid: 0 vtid: 4/25 LOG:  statement:               SELECT a.attname,a.atttypmod
                    FROM pg_attribute a LEFT JOIN pg_attrdef d
                      ON a.attrelid = d.adrelid AND a.attnum = d.adnum
                   WHERE a.attrelid = '"users"'::regclass
                     AND a.attnum > 0 AND NOT a.attisdropped
                   ORDER BY a.attnum

pid: 33627 tid: 0 vtid: 3/28 LOG:  execute a2: SELECT  "users".* FROM "users"  WHERE "users"."id" = $1 LIMIT 1
pid: 33627 tid: 0 vtid: 3/28 DETAIL:  parameters: $1 = '2'
pid: 33627 tid: 0 vtid: 3/28 LOG:  statement: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.619282' WHERE "users"."id" = 2
pid: 33628 tid: 0 vtid: 4/25 LOG:  execute a2: SELECT  "users".* FROM "users"  WHERE "users"."id" = $1 LIMIT 1
pid: 33628 tid: 0 vtid: 4/25 DETAIL:  parameters: $1 = '1'
pid: 33627 tid: 6723 vtid: 3/28 LOG:  execute a2: SELECT  "users".* FROM "users"  WHERE "users"."id" = $1 LIMIT 1
pid: 33627 tid: 6723 vtid: 3/28 DETAIL:  parameters: $1 = '1'
pid: 33628 tid: 0 vtid: 4/25 LOG:  statement: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.627175' WHERE "users"."id" = 1
pid: 33627 tid: 6723 vtid: 3/28 LOG:  statement: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.628983' WHERE "users"."id" = 1
pid: 33628 tid: 6724 vtid: 4/25 LOG:  execute a2: SELECT  "users".* FROM "users"  WHERE "users"."id" = $1 LIMIT 1
pid: 33628 tid: 6724 vtid: 4/25 DETAIL:  parameters: $1 = '2'
pid: 33628 tid: 6724 vtid: 4/25 LOG:  statement: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.632111' WHERE "users"."id" = 2
pid: 33627 tid: 6723 vtid: 3/28 ERROR:  deadlock detected
pid: 33627 tid: 6723 vtid: 3/28 DETAIL:  Process 33627 waits for ShareLock on transaction 6724; blocked by process 33628.
    Process 33628 waits for ShareLock on transaction 6723; blocked by process 33627.
    Process 33627: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.628983' WHERE "users"."id" = 1
    Process 33628: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.632111' WHERE "users"."id" = 2

我很惊讶发生了这种死锁,因为行和表都没有被锁定.不应该第二次交易只是等待第一次完成?

更新:这里最小的sql-only bug:https://gist.github.com/jjb/9823023

解决方法

您的原始日志不完整,因为第一个UPDATE没有事务ID,因此它们不包含在原始日志中.

既然存在具有虚拟txids的日志,您可以看到Postgresql的行为完全符合设计.

相关的路线是:

pid: 33627 tid: 0 vtid: 3/28 LOG:  statement: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.619282' WHERE "users"."id" = 2

pid: 33628 tid: 0 vtid: 4/25 LOG:  statement: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.627175' WHERE "users"."id" = 1

pid: 33627 tid: 6723 vtid: 3/28 LOG:  statement: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.628983' WHERE "users"."id" = 1

pid: 33628 tid: 6724 vtid: 4/25 LOG:  statement: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.632111' WHERE "users"."id" = 2

pid: 33627 tid: 6723 vtid: 3/28 ERROR:  deadlock detected
pid: 33627 tid: 6723 vtid: 3/28 DETAIL:  Process 33627 waits for ShareLock on transaction 6724; blocked by process 33628.
    Process 33628 waits for ShareLock on transaction 6723; blocked by process 33627.
    Process 33627: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.628983' WHERE "users"."id" = 1
    Process 33628: UPDATE "users" SET "updated_at" = '2014-03-27 23:58:02.632111' WHERE "users"."id" = 2

这里:

> 3/28锁定id = 2进行更新
> 4/25锁定id = 1进行更新
> 3/28尝试锁定id = 1进行更新,阻止4/25持有的锁定
> 4/25尝试锁定id = 2进行更新,阻塞3/28所持有的锁定

在这一点上,两个事务都无法取得进展,因此Postgresql会中止其中一个事务.

为了防止这种情况发生,您的应用程序必须确保它始终以相同的顺序获取锁.如果不可能,它必须尝试在任何给定的事务中仅使用一个依赖对象,因此不会出现排序问题.或者它必须准备通过捕获异常并重新发出事务来处理死锁.

要了解更多信息,请参阅手册中的Explicit locking.

相关文章

validates:conclusion,:presence=>true,:inclusion=>{...
一、redis集群搭建redis3.0以前,提供了Sentinel工具来监控各...
分享一下我老师大神的人工智能教程。零基础!通俗易懂!风趣...
上一篇博文 ruby传参之引用类型 里边定义了一个方法名 mo...
一编程与编程语言 什么是编程语言? 能够被计算机所识别的表...
Ruby类和对象Ruby是一种完美的面向对象编程语言。面向对象编...