C++ – Generic char[]-based storage and avoiding strict-aliasing-related UB

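I'm trying to build a class template that packs a bunch of types into a suitably large char array and allows access to the data as individual, correctly typed references. According to the standard, this can lead to a strict-aliasing violation, and hence undefined behavior, because we are accessing the char[] data through an object of a type that is not compatible with it. Specifically, the standard states: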

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

the dynamic type of the object,

a cv-qualified version of the dynamic type of the object,

a type similar (as defined in 4.4) to the dynamic type of the object,

a type that is the signed or unsigned type corresponding to the dynamic type of the object,

a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

a char or unsigned char type.

Given the wording of the highlighted bullet (the one about aggregate or union types), I came up with the following alias_cast idea:

#include <iostream>
#include <type_traits>

template <typename T>
T alias_cast(void *p) {
    typedef typename std::remove_reference<T>::type BaseType;
    union UT {
        BaseType t;
    };
    return reinterpret_cast<UT*>(p)->t;
}

template <typename T,typename U>
class Data {
    union {
        long align_;
        char data_[sizeof(T) + sizeof(U)];
    };
public:
    Data(T t = T(),U u = U()) { first() = t; second() = u; }
    T& first() { return alias_cast<T&>(data_); }
    U& second() { return alias_cast<U&>(data_ + sizeof(T)); }
};


int main() {
    Data<int,unsigned short> test;
    test.first() = 0xdead;
    test.second() = 0xbeef;
    std::cout << test.first() << "," << test.second() << "\n";
    return 0;
}

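(The test code above, especially the Data class, is just a dumbed-down demonstration of the idea, so please don't point out that I should use std::pair or std::tuple. The alias_cast template should also be extended to handle cv-qualified types, and it can only be used safely if the alignment requirements are met, but I hope this snippet is enough to demonstrate the idea.)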

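This trick silences the warnings from g++ (when compiled with g++ -std=c++11 -Wall -Wextra -O2 -fstrict-aliasing -Wstrict-aliasing), and the code works, but is it really a valid way of telling the compiler to skip strict-aliasing-based optimizations?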

If it is not valid, how would one go about implementing a generic storage class based on a char array without violating the aliasing rules?

Edit:
Replacing alias_cast with a simple reinterpret_cast, like this:

T& first() { return reinterpret_cast<T&>(*(data_ + 0)); }
U& second() { return reinterpret_cast<U&>(*(data_ + sizeof(T))); }

produces the following warning when compiled with g++:

aliastest-so-1.cpp: In instantiation of ‘T& Data::first() [with T = int; U = short unsigned int]’:
aliastest-so-1.cpp:28:16: required from here
aliastest-so-1.cpp:21:58: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]

Solution

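Using a union is almost never a good idea if you want to stick to strict conformance: unions have stringent rules when it comes to reading the active member (and only that one). That said, implementations do like to use unions as hooks for reliable behavior, and perhaps that is what you are after. If so, I defer to Mike Acton, who has written a nice (and long) article on the aliasing rules, in which he comments on casting through a union.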

To the best of my knowledge, this is how you should deal with arrays of char types used as storage:

// char or unsigned char are both acceptable
alignas(alignof(T)) unsigned char storage[sizeof(T)];
::new (&storage) T;
T* p = static_cast<T*>(static_cast<void*>(&storage));

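The reason this is defined to work is that T is the dynamic type of the object here. The storage was reused when the new expression created the T object, and that operation implicitly ended the lifetime of the previous contents of storage (which happens trivially, since unsigned char is a, well, trivial type).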
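As a rough illustration of the storage-reuse argument, the sketch below creates an object in a raw buffer, uses it through the pointer returned by the new expression, and then ends its lifetime explicitly. The Widget type and the roundtrip function are hypothetical names invented for this example; they are not part of the question or of any library.

```cpp
#include <cassert>
#include <new>      // placement new
#include <string>

// Hypothetical type, used only for illustration.
struct Widget {
    std::string name;
    int value;
};

// Creates a Widget inside raw unsigned char storage, reads it back and
// destroys it. Returns value plus the length of name.
int roundtrip(int value) {
    alignas(Widget) unsigned char storage[sizeof(Widget)];

    // The new expression reuses the storage: from here on, the dynamic
    // type of the object living in the buffer is Widget.
    Widget* w = ::new (static_cast<void*>(storage)) Widget{"demo", value};

    // The object sits at the start of the buffer.
    assert(static_cast<void*>(w) == static_cast<void*>(storage));

    int result = w->value + static_cast<int>(w->name.size());

    // Widget is not trivially destructible, so its lifetime has to be
    // ended explicitly before the buffer goes away.
    w->~Widget();
    return result;
}
```

Calling roundtrip(3) goes through the whole create, use, destroy cycle without ever reading the Widget through a char-typed glvalue.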

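You can still use, for example, storage[0] to read the bytes of the object, because that reads an object's value through a glvalue of unsigned char type, which is one of the explicitly listed exceptions. If, on the other hand, storage had a different yet still trivial element type, you could still make the snippet above work, but you would not be able to do storage[0].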
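A minimal sketch of that byte-wise read, assuming 8-bit bytes (checked with a static_assert); sum_bytes is a hypothetical helper name. Summing the bytes keeps the result independent of endianness.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <new>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Places a std::uint32_t in unsigned char storage and adds up its bytes.
unsigned sum_bytes(std::uint32_t v) {
    alignas(std::uint32_t) unsigned char storage[sizeof(std::uint32_t)];
    std::uint32_t* p = ::new (static_cast<void*>(storage)) std::uint32_t(v);
    assert(*p == v);  // here the object is accessed through its own type

    unsigned total = 0;
    for (std::size_t i = 0; i < sizeof(std::uint32_t); ++i) {
        // Reading through an unsigned char glvalue is one of the
        // exceptions listed in the aliasing rule.
        total += storage[i];
    }
    return total;
}
```

For the value 0x01020304 the four bytes are 1, 2, 3 and 4 in some order, so the sum is 10 on both little-endian and big-endian machines.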

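The final piece that makes the snippet work is the pointer conversion. Note that reinterpret_cast is not suitable in the general case. It can be valid when T is standard-layout (there are additional restrictions on alignment, too), but in that case using reinterpret_cast is equivalent to static_casting via void*, as I did. It makes more sense to use that form directly in the first place, especially since storage like this is used a lot in generic contexts. In any case, converting to and from void* is one of the standard conversions (with a well-defined meaning), and static_cast is what you want for those.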

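If you are at all worried about the pointer conversions (which are the weakest link in my opinion, rather than the argument about storage reuse), then an alternative is to do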

T* p = ::new (&storage) T;

which comes at the cost of storing an additional pointer if you want to keep track of the object.

I heartily recommend the use of std::aligned_storage.
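Putting the pieces together, here is one possible sketch of the question's Data template built on placement new. It is not the answerer's code: it assumes a single alignas-qualified unsigned char buffer instead of std::aligned_storage (which C++23 deprecates), keeps the pointers returned by the new expressions so no cast is needed on access, and disables copying for brevity. The second_offset constant and the demo function are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

template <typename T, typename U>
class Data {
    // Offset of the U object: sizeof(T) rounded up to a multiple of alignof(U).
    static constexpr std::size_t second_offset =
        (sizeof(T) + alignof(U) - 1) / alignof(U) * alignof(U);

    // The stricter of the two alignments applies to the buffer.
    alignas(T) alignas(U) unsigned char data_[second_offset + sizeof(U)];
    T* first_;   // pointers returned by placement new
    U* second_;

public:
    explicit Data(const T& t = T(), const U& u = U())
        : first_(::new (static_cast<void*>(data_)) T(t)), second_(nullptr) {
        try {
            second_ = ::new (static_cast<void*>(data_ + second_offset)) U(u);
        } catch (...) {
            first_->~T();  // undo the first construction if U's copy throws
            throw;
        }
    }

    // Copying would require re-creating the objects; left out of this sketch.
    Data(const Data&) = delete;
    Data& operator=(const Data&) = delete;

    ~Data() {
        second_->~U();
        first_->~T();
    }

    T& first() { return *first_; }
    U& second() { return *second_; }
};

// Mirrors the question's main(): stores two values and reads them back.
unsigned demo() {
    Data<int, unsigned short> test;
    test.first() = 0xdead;
    test.second() = 0xbeef;
    assert(test.first() == 0xdead);
    return test.second();
}
```

The two stored pointers are the cost mentioned above; in exchange, first() and second() only ever access the objects through pointers of their own dynamic types.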
