Resampling with duplicate keys in Deedle

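Problem description

My code below resamples profit statistics from 5-minute intervals to 1-day intervals. The problem is that BacktestResult contains duplicate CloseDate values, because I am testing multiple pairs (TRXUSDT, ETHUSDT and BTCUSDT). dailyProfit is a Series<DateTime, double>, which explains the exception: the same key appears more than once. How can I make it group by pair, or by something similar? It works fine when I test with a single pair.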

// Create series
var series = _backtestResults.ToOrdinalSeries();

// daily_profit = results.resample('1d', on = 'close_date')['profit_percent'].sum()
var dailyProfit = series.ResampleEquivalence(
    // Group key: the UTC calendar day of the close date (midnight, time of day dropped)
    index => new DateTime(series[index].CloseDate.Year, series[index].CloseDate.Month, series[index].CloseDate.Day, 0, 0, 0, DateTimeKind.Utc),
    // Aggregate: sum of the profit percentages in each group
    group => group.SelectValues(g => g.ProfitPercentage).Sum()).DropMissing();

// classes
public class BacktestResult
{
    public string Pair { get; set; }
    public decimal ProfitPercentage { get; set; }
    public decimal ProfitAbs { get; set; }
    public decimal OpenRate { get; set; }
    public decimal CloseRate { get; set; }
    public DateTime OpenDate { get; set; }
    public DateTime CloseDate { get; set; }
    public decimal OpenFee { get; set; }
    public decimal CloseFee { get; set; }
    public decimal Amount { get; set; }
    public decimal TradeDuration { get; set; }
    public SellType SellReason { get; set; }
}

Edit:

An example that fetches the JSON data from pastebin:

using Deedle;
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

namespace Resample
{
    class Program
    {
        public class BacktestResultTest
        {
            public string Pair { get; set; }
            public decimal ProfitPercentage { get; set; }
            public decimal ProfitAbs { get; set; }
            public decimal OpenRate { get; set; }
            public decimal CloseRate { get; set; }
            public DateTime OpenDate { get; set; }
            public DateTime CloseDate { get; set; }
            public decimal OpenFee { get; set; }
            public decimal CloseFee { get; set; }
            public decimal Amount { get; set; }
            public decimal TradeDuration { get; set; }
            public bool OpenAtEnd { get; set; }
            public int SellReason { get; set; }
        }

        static void Main(string[] args)
        {
            // Take JSON data from pastebin
            using var webClient = new WebClient();
            var json = webClient.DownloadString("https://pastebin.com/raw/Dhp9202f");

            // Deserialize the data
            var data = JsonConvert.DeserializeObject<List<BacktestResultTest>>(json);

            var ts = data.ToOrdinalSeries();

            // Re-key the series by (close date, pair) so that keys are unique
            var byDateAndPair = ts.SelectKeys(kvp => Tuple.Create(kvp.Value.Value.CloseDate, kvp.Value.Value.Pair)).SortByKey();

            // daily_profit = results.resample('1d', on = 'close_date')['profit_percent'].sum()
            var dailyProfit2 = byDateAndPair.ResampleEquivalence(
                k => Tuple.Create(new DateTime(k.Item1.Year, k.Item1.Month, k.Item1.Day), k.Item2),
                g => g.Select(kvp => kvp.Value.ProfitPercentage).Sum());

            // backtest_worst_day = min(daily_profit)
            var worstDay2 = dailyProfit2.Min();
            // backtest_best_day = max(daily_profit)
            var bestDay2 = dailyProfit2.Max();
            // winning_days = sum(daily_profit > 0)
            var winningDays2 = dailyProfit2.SelectValues(x => x > 0).Sum();
            // draw_days = sum(daily_profit == 0)
            var drawDays2 = dailyProfit2.SelectValues(x => x == 0).Sum();
            // losing_days = sum(daily_profit < 0)
            var losingDays2 = dailyProfit2.SelectValues(x => x < 0).Sum();

            Console.ReadLine();
        }
    }
}

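Solution

You can use a custom data type as a key in Deedle. If you want to be able to resample the series, the key type needs to support IComparable. You can define your own type or use the built-in Tuple.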
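As an illustration of the "define your own type" option, here is a minimal sketch of a comparable key. The name DayPairKey and its members are hypothetical (they are not part of Deedle), and the sketch only shows the plain .NET side: a value type that orders by day and then by pair. Whether a particular Deedle operation accepts it as a key is something to verify against your Deedle version.

```csharp
using System;

// Hypothetical key type (not part of Deedle): a calendar day plus a trading pair.
public readonly struct DayPairKey : IComparable<DayPairKey>, IComparable, IEquatable<DayPairKey>
{
    public DateTime Day { get; }
    public string Pair { get; }

    public DayPairKey(DateTime timestamp, string pair)
    {
        Day = timestamp.Date; // drop the time-of-day component
        Pair = pair;
    }

    // Order by day first, then by pair name (ordinal string comparison).
    public int CompareTo(DayPairKey other)
    {
        int byDay = Day.CompareTo(other.Day);
        return byDay != 0 ? byDay : string.CompareOrdinal(Pair, other.Pair);
    }

    // Non-generic IComparable: by convention any instance sorts after null.
    public int CompareTo(object obj)
    {
        if (obj is null) return 1;
        if (obj is DayPairKey other) return CompareTo(other);
        throw new ArgumentException("Expected a DayPairKey.", nameof(obj));
    }

    // Equality must agree with the ordering so the type is safe to use as a key.
    public bool Equals(DayPairKey other) => Day == other.Day && Pair == other.Pair;
    public override bool Equals(object obj) => obj is DayPairKey other && Equals(other);
    public override int GetHashCode() => HashCode.Combine(Day, Pair);
    public override string ToString() => $"{Day:yyyy-MM-dd} {Pair}";
}
```

Two trades of the same pair that close at different times on the same day produce equal keys, which is the property the daily grouping relies on.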

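Suppose we have some very basic data:

var ts =
  new[] {
    KeyValue.Create(new DateTime(2020,1,1), new { Value = 1.0, Kind = "A" }),
    KeyValue.Create(new DateTime(2020,1,2), new { Value = 1.0, Kind = "A" }),
    KeyValue.Create(new DateTime(2020,1,3), new { Value = 1.0, Kind = "B" }),
    KeyValue.Create(new DateTime(2020,1,4), new { Value = 1.0, Kind = "B" }),
  }.ToSeries();

The first thing we need to do is change the key so that it consists of both the date and the kind. (In fact, if you have duplicate dates, you may run into trouble even earlier in your code!)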

var byDateAndKind =
  ts.SelectKeys(kvp => Tuple.Create(kvp.Key,kvp.Value.Value.Kind)).SortByKey();

Now the key is a Tuple<DateTime, string>, consisting of the date and the kind. You can use ResampleEquivalence on this series. Here, we use the year and the kind as the new key and sum the values in each group:

var aggByYearAndKind =
  byDateAndKind.ResampleEquivalence(
    (k) => Tuple.Create(k.Item1.Year, k.Item2),
    (g) => g.Select(kvp => kvp.Value.Value).Sum());

aggByYearAndKind.Print();

This prints a series that maps 2020,"A" to 2 and 2020,"B" to 2.

EDIT: You are right - this does not seem to work. I was able to get it working by using GroupBy instead of ResampleEquivalence:

var dailyProfit2 =
  ts.GroupBy(kvp =>
    new {
      Date = new DateTime(kvp.Value.CloseDate.Year, kvp.Value.CloseDate.Month, kvp.Value.CloseDate.Day),
      Kind = kvp.Value.Pair
    })
    .SelectValues(g => g.Select(kvp => kvp.Value.ProfitPercentage).Values.Sum());

// backtest_worst_day = min(daily_profit)
var worstDay2 = dailyProfit2.Min();
// backtest_best_day = max(daily_profit)
var bestDay2 = dailyProfit2.Max();
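// Note: in pandas, sum(daily_profit > 0) counts the entries that satisfy the condition,
// so the three statistics below count entries instead of summing their profits.
// Each entry here is one (date, pair) group.
// winning_days = sum(daily_profit > 0)
var winningDays2 = dailyProfit2.Where(x => x.Value > 0).KeyCount;
// draw_days = sum(daily_profit == 0)
var drawDays2 = dailyProfit2.Where(x => x.Value == 0).KeyCount;
// losing_days = sum(daily_profit < 0)
var losingDays2 = dailyProfit2.Where(x => x.Value < 0).KeyCount;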
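
To cross-check the numbers independently of Deedle, the same aggregation can be sketched with plain LINQ. This is not the approach used in the answer above, just a reference under stated assumptions: Trade is a hypothetical, trimmed-down stand-in for BacktestResult, and DailyProfit with its methods is likewise made up for this sketch. ByDay sums across all pairs, which is what the pandas line in the comments computes, while ByDayAndPair keeps the pairs separate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for BacktestResult.
public sealed class Trade
{
    public string Pair { get; set; }
    public DateTime CloseDate { get; set; }
    public decimal ProfitPercentage { get; set; }
}

public static class DailyProfit
{
    // Sum of ProfitPercentage for every (day, pair) combination.
    public static Dictionary<(DateTime Day, string Pair), decimal> ByDayAndPair(IEnumerable<Trade> trades) =>
        trades
            .GroupBy(t => (Day: t.CloseDate.Date, Pair: t.Pair))
            .ToDictionary(g => g.Key, g => g.Sum(t => t.ProfitPercentage));

    // Sum of ProfitPercentage per day across all pairs, ordered by day.
    public static SortedDictionary<DateTime, decimal> ByDay(IEnumerable<Trade> trades)
    {
        var result = new SortedDictionary<DateTime, decimal>();
        foreach (var g in trades.GroupBy(t => t.CloseDate.Date))
        {
            result[g.Key] = g.Sum(t => t.ProfitPercentage);
        }
        return result;
    }

    // Counts (not sums) of positive, zero and negative entries,
    // mirroring sum(daily_profit > 0) and friends in pandas.
    public static (int Winning, int Draw, int Losing) CountDays(IEnumerable<decimal> dailyProfits)
    {
        var list = dailyProfits.ToList();
        return (list.Count(p => p > 0), list.Count(p => p == 0), list.Count(p => p < 0));
    }
}
```

Because LINQ's GroupBy does not need unique keys on its input, duplicate CloseDate values across pairs are not a problem here; the worst and best day are then simply `DailyProfit.ByDay(trades).Values.Min()` and `.Max()`.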