(Trying to) dig into MemoryCache in .NET Core

Sample code for this article: https://github.com/zhaokuohaha/SimpleTest/blob/master/SimpleTests/MemoryCacheTests.cs

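Tracking down the problem

Last week we hit a strange problem in a project: when querying the data for a certain page, the page's configuration items came back scrambled in a highly irregular way. Switching to another task left the page with unexpected, wrong configuration. After some investigation, the cache turned out to be the cause.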

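Building the configuration is fairly expensive, and in the first version the configuration was fixed, so back then we cached the fully built result in MemoryCache. Later, a few of the configuration items had to change per task. Because the changes were small, we did not rebuild and re-cache the configuration; we read it from the cache and then modified the relevant items. In pseudocode: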

using Microsoft.Extensions.Caching.Memory;

private IMemoryCache _cache;
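public MySetting GetSetting(object task)
{
    // Read the cached settings; on a miss, load them from the database and cache them.
    var setting = _cache.Get<MySetting>("key");
    if (setting == null)
    {
        setting = GetSettingFromDb();
        _cache.Set("key", setting);
    }

    // Per-task override. Pseudocode: task stands in for the real task type.
    setting.a = task.something();
    return setting;
}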

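What we expected: after reading setting from the cache, we modify the relevant items and return it, without affecting the data in the cache.

What actually happened: after modifying some properties of setting, the corresponding value in MemoryCache changed as well, even though we never called Set again.

A simple example confirms this.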

[Fact]
public void TestMemoryObject()
{
    var cache = new MemoryCache(new MemoryCacheOptions());

    cache.Set("user", new User { Name = "Foo" });
    var f = cache.Get<User>("user");
    Assert.Equal("Foo", f.Name);

    f.Name = "Bar";
    var b = cache.Get<User>("user");
    Assert.Equal("Bar", b.Name);
}

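Even though the cache was never set again and only f.Name was modified, the object read back from the cache had changed. That is what scrambled the cached configuration.

This is not what we expected from a cache. In principle, modifying an object taken out of the cache should not change the cached value unless we call Set again. The observed behavior looks more like an ordinary object reference. So is MemoryCache's caching just plain object assignment? That made me a little curious about how this class is implemented.

So I took a quick look at the package's source code, focusing on how the two methods used above, Set and Get, are implemented.

First, the description of the class: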

/// <summary>
/// An implementation of <see cref="IMemoryCache"/> using a dictionary to
/// store its entries.
/// </summary>
public class MemoryCache : IMemoryCache

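That already explains a lot: the class implements the IMemoryCache interface with a dictionary and does not inherit from any other type. If so, it makes sense that a cached value is just an object reference. Let's read on.

All cache entries live in a thread-safe dictionary, which is simply created with new in the constructor: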

private readonly ConcurrentDictionary<object, CacheEntry> _entries;
// We store the delegates locally to prevent allocations
// every time a new CacheEntry is created.
private readonly Action<CacheEntry> _setEntry;
private readonly Action<CacheEntry> _entryExpirationNotification;
// ...
public MemoryCache(IOptions<MemoryCacheOptions> optionsAccessor, ILoggerFactory loggerFactory)
{
    // ...
    _entries = new ConcurrentDictionary<object, CacheEntry>();
    _setEntry = SetEntry;
    _entryExpirationNotification = EntryExpired;
}

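Next, the two methods that the test code above ultimately relies on: CreateEntry and TryGetValue. (Set and Get are extension methods built on top of them.)

CreateEntry: after a round of checks and validation, it returns a new entry object. The _setEntry delegate bound in the MemoryCache constructor is passed to the entry's constructor and is invoked when the cached object is committed. So let's also look at SetEntry, the core method of this class. It looks convoluted, but overall it does just two things:
- Adds the entry to the _entries dictionary, or updates the existing one
- Sets the entry's expiration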

public ICacheEntry CreateEntry(object key)
{
    CheckDisposed();

    ValidateCacheKey(key);

    return new CacheEntry(
        key,
        _setEntry,
        _entryExpirationNotification
    );
}

private void SetEntry(CacheEntry entry)
{
    // ...
    entry._absoluteExpiration = absoluteExpiration;
    
    // ...
    if (_entries.TryGetValue(entry.Key, out CacheEntry priorEntry))
    {
        priorEntry.SetExpired(EvictionReason.Replaced);
    }

    // ...
    if (priorEntry == null)
    {
        // Try to add the new entry if no previous entries exist.
        entryAdded = _entries.TryAdd(entry.Key, entry);
    }
    else
    {
        // Try to update with the new entry if a previous entries exist.
        entryAdded = _entries.TryUpdate(entry.Key, entry, priorEntry);
    }
}
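
To make the CreateEntry / SetEntry relationship concrete, here is a minimal sketch that stores a value by hand instead of going through the Set extension method. It uses only the public IMemoryCache / ICacheEntry API; the class and method names (CreateEntryDemo, StoreByHand) are made up for illustration, and the comment about when SetEntry runs reflects my reading of the source quoted above rather than a documented guarantee.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

public static class CreateEntryDemo
{
    // Stores a value via CreateEntry and reports whether the cache
    // returns the very same instance afterwards.
    public static bool StoreByHand()
    {
        using var cache = new MemoryCache(new MemoryCacheOptions());
        var value = new object();

        // CreateEntry only builds the entry; nothing is in the cache yet.
        using (ICacheEntry entry = cache.CreateEntry("key"))
        {
            entry.Value = value;
        } // Disposing the entry commits it to the cache (assumed: this is where SetEntry runs).

        return cache.TryGetValue("key", out object cached)
            && ReferenceEquals(value, cached);
    }

    public static void Main()
    {
        Console.WriteLine(StoreByHand());
    }
}
```

Note that the entry holds the value itself; nothing in this path copies or serializes it.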

TryGetValue is just as blunt: it looks up the entry in the dictionary, assigns its value to the out parameter, and returns.

public bool TryGetValue(object key, out object result)
{
    // ...
    if (_entries.TryGetValue(key, out CacheEntry entry))
    {
        // ... 
        found = true;
        entry.LastAccessed = utcNow;
        result = entry.Value;
    }

    return found;
}

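So the picture is clear: in this package, MemoryCache implements the IMemoryCache interface and provides the in-memory caching logic. It keeps every stored object in a dictionary (a ConcurrentDictionary), and on a read it looks the entry up in that dictionary and returns a reference to the stored object, not a copy. This explains the behavior of the sample code above. The package's own test cases contain a similar test: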

https://github.com/dotnet/extensions/blob/f8d203d91a132ab24e1c481e3be299b7afcbf21d/src/Caching/Memory/test/MemoryCacheSetAndRemoveTests.cs#L64-L101
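
The conclusion can also be checked directly with reference equality. The sketch below is a small, hypothetical test of my own (the names ReferenceDemo and ReturnsSameInstance are not from the package); it only assumes the Microsoft.Extensions.Caching.Memory package is referenced.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

public class User
{
    public string Name { get; set; }
}

public static class ReferenceDemo
{
    // Returns true when the cache hands back the exact instance that was stored.
    public static bool ReturnsSameInstance()
    {
        using var cache = new MemoryCache(new MemoryCacheOptions());
        var original = new User { Name = "Foo" };

        cache.Set("user", original);
        var fromCache = cache.Get<User>("user");

        return ReferenceEquals(original, fromCache);
    }

    public static void Main()
    {
        Console.WriteLine(ReturnsSameInstance()); // prints "True"
    }
}
```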

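What about other implementations?

Given that, is there another cache implementation, or another implementation of IMemoryCache, that meets our original requirement?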

System.Runtime.Caching
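- The design is somewhat more complex: a MemoryCacheStore manages the cache entries and the operations on them
- It stores _entries in a Hashtable and takes a lock to keep access thread-safe (see MemoryCacheStore)
- The behavior is essentially the same, though: I tried it and a cached object is still returned by reference. See the Get and Set methods of MemoryCacheStore: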
```
internal void Set(MemoryCacheKey key, MemoryCacheEntry entry)

internal MemoryCacheEntry Get(MemoryCacheKey key)
```
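
For reference, this is roughly how the same experiment looks against System.Runtime.Caching. It is a sketch under assumptions: the System.Runtime.Caching package is referenced (on .NET Core it ships as a separate NuGet package), and the cache name "demo" and the class name RuntimeCachingDemo are arbitrary choices of mine.

```csharp
using System;
using System.Runtime.Caching;

public class User
{
    public string Name { get; set; }
}

public static class RuntimeCachingDemo
{
    // Mutates an object read from the cache, then reads it again.
    public static string ReadAfterMutation()
    {
        using var cache = new MemoryCache("demo");
        cache.Set("user", new User { Name = "Foo" }, new CacheItemPolicy());

        var first = (User)cache.Get("user");
        first.Name = "Bar"; // no Set call afterwards

        var second = (User)cache.Get("user");
        return second.Name;
    }

    public static void Main()
    {
        Console.WriteLine(ReadAfterMutation()); // prints "Bar"
    }
}
```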
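
Rolling our own extension

We can exploit the immutability of String: convert arrays and objects to a string before caching them, and parse the string back into an object on read. Every read then yields a different reference, so modifying one of them no longer affects the value in the cache.

Note that the string conversion needs to be tested, and different data types call for different conversion methods. Generics also make a lazier, one-size-fits-all version possible. Sample code: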

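using Microsoft.Extensions.Caching.Memory;
using Newtonsoft.Json;

public static class MemoryCacheExtensions
{
    // Wrapper so that primitives, arrays, and objects all serialize the same way.
    class Entry<T>
    {
        public T Value { get; set; }
    }

    // Reads the cached JSON string and deserializes it into a fresh instance.
    public static T MyGet<T>(this IMemoryCache cache, object key)
    {
        var res = cache.TryGetValue(key, out string entryStr);
        if (!res)
            return default;

        var entry = JsonConvert.DeserializeObject<Entry<T>>(entryStr);
        return entry.Value;
    }

    // Serializes the value to JSON and caches the string; returns false on failure.
    public static bool MySet<T>(this IMemoryCache cache, object key, T value)
    {
        var entry = new Entry<T> { Value = value };
        try
        {
            cache.Set(key, JsonConvert.SerializeObject(entry));
            return true;
        }
        catch
        {
            return false;
        }
    }
}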
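
To see the effect of the string round trip in isolation, here is a self-contained sketch of the same idea. It swaps in System.Text.Json for Json.NET so that it runs without extra packages beyond Microsoft.Extensions.Caching.Memory; that substitution, and the names RoundTripDemo and ReadAfterMutation, are my own assumptions and not part of the extension class above.

```csharp
using System;
using System.Text.Json;
using Microsoft.Extensions.Caching.Memory;

public class User
{
    public string Name { get; set; }
}

public static class RoundTripDemo
{
    // Caches the JSON text, mutates one deserialized copy, then reads again.
    public static string ReadAfterMutation()
    {
        using var cache = new MemoryCache(new MemoryCacheOptions());

        // Store the serialized string, not the object itself.
        cache.Set("user", JsonSerializer.Serialize(new User { Name = "Foo" }));

        // Each read deserializes into a brand-new instance.
        var first = JsonSerializer.Deserialize<User>(cache.Get<string>("user"));
        first.Name = "Bar";

        var second = JsonSerializer.Deserialize<User>(cache.Get<string>("user"));
        return second.Name;
    }

    public static void Main()
    {
        Console.WriteLine(ReadAfterMutation()); // prints "Foo"
    }
}
```

The price is a serialization pass on every read and write, so this fits best when the cached objects are small or the isolation matters more than raw speed.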

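Usage summary

When using MemoryCache, prefer caching only strings; if the value is an object, serialize it before caching.
MemoryCache is not recommended when one application is deployed as multiple backend instances: each instance keeps its own cache, so the caches fall out of sync. Use Redis or another standalone cache instead.
Do not use null as a cached value.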
