automatic-cache-DP

(13 mins to read)

https://codeforces.com/blog/entry/124683

个人认为不是很使用,仅作为开拓视野(🧐原来还可以这样)。

众所周知,在Python中存在@cache这个装饰器,从而可以非常方便的写一些递归式的DP。

1
2
3
4
from functools import cache
@cache
def factorial(n):
return n * factorial(n - 1) if n else 1

下面在C++中实现类似的功能。

use_cache的第一个模板参数Signature是函数签名,第二个模板参数Lambda则是被包装的lambda函数,它返回一个仿函数Cache,其成员包括该lambda函数f,以及用于记忆化的哈希表memo。所有f的参数会打包成一个tuple作为memo的key,而f的返回值则作为memo的value。

显然所有参数都是值拷贝的,要使用引用,需要用std::reference_wrapper进行包裹。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
template <typename Signature, typename Lambda>
struct Cache;

template <typename ReturnType, typename... Args, typename Lambda>
struct Cache<ReturnType(Args...), Lambda> {
template <typename... DummyArgs>
ReturnType operator()(DummyArgs&&... args) {
auto tied_args = std::tie(args...);
auto it = memo.find(tied_args);
if (it == memo.end()) {
auto&& ans = f(*this, std::forward<DummyArgs>(args)...);
memo[tied_args] = ans;
return ans;
} else {
return it->second;
}
}

template <class _Lambda>
Cache(std::tuple<>, _Lambda&& _f) : f(std::forward<_Lambda>(_f)) {}

Lambda f;
using TiedArgs = std::tuple<std::decay_t<Args>...>;
pbds::unordered_map<TiedArgs, ReturnType, hashing::custom_hash<TiedArgs>> memo;
};

template <class Signature, class Lambda>
auto use_cache(Lambda&& f) {
return Cache<Signature, Lambda>(std::tuple{}, std::forward<Lambda>(f));
}

为了支持较多的参数类型,如tuplecontainer等,需要手写hash。下面是一个很不错的示例,其实关键就是写一个好的hash_combine函数(显然异或是不好的)。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
namespace hashing {
using i64 = std::int64_t;
using u64 = std::uint64_t;
static const u64 FIXED_RANDOM = std::chrono::steady_clock::now().time_since_epoch().count();

#if USE_AES
std::mt19937 rd(FIXED_RANDOM);
const __m128i KEY1{(i64)rd(), (i64)rd()};
const __m128i KEY2{(i64)rd(), (i64)rd()};
#endif

template <class T, class D = void>
struct custom_hash {};

// https://www.boost.org/doc/libs/1_55_0/doc/html/hash/combine.html
template <class T>
inline void hash_combine(u64& seed, const T& v) {
custom_hash<T> hasher;
seed ^= hasher(v) + 0x9e3779b97f4a7c15 + (seed << 12) + (seed >> 4);
};

// http://xorshift.di.unimi.it/splitmix64.c
template <class T>
struct custom_hash<T, typename std::enable_if<std::is_integral<T>::value>::type> {
u64 operator()(T _x) const {
u64 x = _x;
#if USE_AES
// implementation defined till C++17, defined from C++20
__m128i m{i64(u64(x) * 0xbf58476d1ce4e5b9u64), (i64)FIXED_RANDOM};
__m128i y = _mm_aesenc_si128(m, KEY1);
__m128i z = _mm_aesenc_si128(y, KEY2);
return z[0];
#else
x += 0x9e3779b97f4a7c15 + FIXED_RANDOM;
x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9;
x = (x ^ (x >> 27)) * 0x94d049bb133111eb;
return x ^ (x >> 31);
#endif
}
};

template <class T>
struct custom_hash<T, std::void_t<decltype(std::begin(std::declval<T>()))>> {
u64 operator()(const T& a) const {
u64 value = FIXED_RANDOM;
for (auto& x : a) hash_combine(value, x);
return value;
}
};

template <class... T>
struct custom_hash<std::tuple<T...>> {
u64 operator()(const std::tuple<T...>& a) const {
u64 value = FIXED_RANDOM;
std::apply([&value](T const&... args) { (hash_combine(value, args), ...); }, a);
return value;
}
};

template <class T, class U>
struct custom_hash<std::pair<T, U>> {
u64 operator()(const std::pair<T, U>& a) const {
u64 value = FIXED_RANDOM;
hash_combine(value, a.first);
hash_combine(value, a.second);
return value;
}
};

}; // namespace hashing

最后,众所周知std::unordered_map的访存很差,性能不好,所以作者选择了gp_hash_table,并提供了一些用户友好的alias。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
#include "ext/pb_ds/assoc_container.hpp"
#include "ext/pb_ds/tree_policy.hpp"

namespace pbds {
using namespace __gnu_pbds;
#ifdef PB_DS_ASSOC_CNTNR_HPP
template <class Key, class Value, class Hash>
using unordered_map = gp_hash_table<Key, Value, Hash, std::equal_to<Key>, direct_mask_range_hashing<>, linear_probe_fn<>,
hash_standard_resize_policy<hash_exponential_size_policy<>, hash_load_check_resize_trigger<>, true>>;
template <class Key, class Hash>
using unordered_set = pbds::unordered_map<Key, null_type, Hash>;
#endif
#ifdef PB_DS_TREE_POLICY_HPP
template <typename T>
using ordered_set = tree<T, null_type, std::less<T>, rb_tree_tag, tree_order_statistics_node_update>;
template <typename T>
using ordered_multiset = tree<T, null_type, std::less_equal<T>, rb_tree_tag, tree_order_statistics_node_update>;
template <class Key, class Value, class Compare = std::less<Key>>
using ordered_map = tree<Key, Value, Compare, rb_tree_tag, tree_order_statistics_node_update>;
#endif
} // namespace pbds

使用方式如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
auto factorial = use_cache<int(int)>([](auto&& self, int n) -> int {
if (n) return n * self(n - 1);
else return 1;
});

auto solve = use_cache<bool(char, pair<int, string>)>([&](auto&& self, char c, pair<int, string> p) -> bool {
// note that v was captured by &
// use c, p, v as needed
// let's say you have c1 and p1 as arguments for a recursive call
auto x = self(c1, p1); // recursive call - do as many of them as needed
// use c, p, v as needed
return y; // return something
});