How to retrieve and update files larger than 1 MB on GitHub using the Octokit.Net Git Data API in C#

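Problem description

I am trying to read and update a single file in a repository using Octokit.Net.

The specific file I want to read/update is about 2.1 MB in size, so when I try to read it with the following code...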

var currentFileText = "";

            var contents = await client.Repository.Content.GetAllContentsByRef("jkears","NextWare.ProductPortal","domainModel.ddd","master");
            var targetFile = contents[0];
            if (targetFile.EncodedContent != null)
            {
                currentFileText = Encoding.UTF8.GetString(Convert.FromBase64String(targetFile.EncodedContent));
            }
            else
            {
                currentFileText = targetFile.Content;
            }

I get this exception.

Octokit.ForbiddenException
  HResult=0x80131500
  Message=This API returns blobs up to 1 MB in size. The requested blob is too large to fetch via the API, but you can use the Git Data API to request blobs up to 100 MB in size.

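My question is: how do I read the contents of this large file using the Git Data API in C#, and how do I then push changes to this file back to the same repository?

Solution

It is not very hard, but it is not obvious either.

The file I was trying to read/update was 2.4 MB. I was able to compress it down to 512K (using SevenZip), which allowed me to read/update it in the repo, but I wanted to be able to read/update files larger than 1 MB.

To accomplish this, I had to use GitHub's GraphQL API. I needed it to retrieve the SHA-1 of the specific file I was interested in reading/updating.

Having never used the Git API or, for that matter, GraphQL, I opted to use a GraphQL client (GraphQL.Client and GraphQL.Client.Serializer.Newtonsoft).

With GraphQL, I was able to retrieve the SHA-1 ID of an existing file/blob in the GitHub repo. Once I had the blob's SHA-1, I could easily pull the file in question through the Git Data API.

I was then able to change the content and push the changes back to GitHub via Octokit.Net.

While this is in no way polished, I wanted to close this question out with something that others can use.

Credit to the following Stack Overflow thread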

public async Task<string> GetSha1(string owner, string personalToken, string repositoryName, string pathName, string branch = "master")
        {
            // Basic credentials built from the owner name and the personal access token.
            string basicValue = Convert.ToBase64String(Encoding.UTF8.GetBytes($"{owner}:{personalToken}"));

            var graphQLClient = new GraphQLHttpClient("https://api.github.com/graphql", new NewtonsoftJsonSerializer());
            graphQLClient.HttpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", basicValue);

            // The expression "<branch>:<path>" resolves to the blob stored at that path on that branch.
            var getShaRequest = new GraphQLRequest
            {
                Query = @"
                    query {
                      repository(owner: """ + owner + @""", name: """ + repositoryName + @""") {
                        object(expression: """ + branch + ":" + pathName + @""") {
                          ... on Blob {
                            oid
                          }
                        }
                      }
                    }",
                Variables = new { }
            };

            var graphQLResponse = await graphQLClient.SendQueryAsync<ResponseType>(getShaRequest, cancellationToken: CancellationToken.None);
            // The blob's oid is its SHA-1.
            return graphQLResponse.Data.Repository.Object.Oid;
        }
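The query above splices the owner, repository name, and path straight into the query text. As a minimal sketch of an alternative, the same lookup can pass those values as GraphQL variables, so a path containing quotes cannot break the query. The class name BlobShaQuery and the method BuildPayload are hypothetical, not part of the original answer; the sketch only builds the JSON body that would be POSTed to https://api.github.com/graphql and does not send it.

```csharp
using System;
using System.Text.Json;

public static class BlobShaQuery
{
    // Same lookup as GetSha1, but with the values declared as GraphQL variables.
    public const string Query = @"
        query($owner: String!, $name: String!, $expression: String!) {
          repository(owner: $owner, name: $name) {
            object(expression: $expression) {
              ... on Blob { oid }
            }
          }
        }";

    // Hypothetical helper: returns the JSON request body ({ query, variables }).
    public static string BuildPayload(string owner, string repositoryName, string pathName, string branch = "master")
    {
        var body = new
        {
            query = Query,
            variables = new
            {
                owner,
                name = repositoryName,
                // "<branch>:<path>" identifies the blob at that path on that branch.
                expression = $"{branch}:{pathName}"
            }
        };
        return JsonSerializer.Serialize(body);
    }
}
```

With GraphQL.Client, the same query text and variables object should be assignable to the Query and Variables properties of the GraphQLRequest used above, in place of the concatenated string and the empty Variables object.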

Here are my helper classes

public class ContentResponseType
        {
            public string content { get; set; }
            public string encoding { get; set; }
            public string url { get; set; }
            public string sha { get; set; }
            public long size { get; set; }
        }

        public class DataObject
        {
            public string Oid;
        }

        public class Repository
        {
            public DataObject Object;
        }

        public class ResponseType
        {
            public Repository Repository { get; set; }
        }

Here is the method that uses the SHA-1 provided by the method above to retrieve the content.

 public async Task<ContentResponseType> RetrieveFileAsync(string owner, string personalToken, string repositoryName, string pathName, string branch = "master")
        {
            // Look up the blob's SHA-1 first, then fetch the blob through the Git Data API.
            var sha1 = await this.GetSha1(owner: owner, personalToken: personalToken, repositoryName: repositoryName, pathName: pathName, branch: branch);
            var url = this.GetBlobUrl(owner, repositoryName, sha1);
            var req = this.BuildRequestMessage(url, personalToken);
            using (var httpClient = new HttpClient())
            {
                var resp = await httpClient.SendAsync(req);
                if (resp.StatusCode != System.Net.HttpStatusCode.OK)
                {
                    throw new Exception($"Error downloading {req.RequestUri}, statusCode={resp.StatusCode}");
                }
                // The response body is JSON; its "content" field holds the Base64-encoded file.
                var jsonString = await resp.Content.ReadAsStringAsync();
                return System.Text.Json.JsonSerializer.Deserialize<ContentResponseType>(jsonString);
            }
        }
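The method above calls GetBlobUrl and BuildRequestMessage, which are not shown in the answer. The following is only a guess at what they might look like, assuming the REST "get a blob" endpoint and token authentication; the class name GitHubBlobRequests, the User-Agent value, and the choice of the Bearer scheme are assumptions, and the methods are written as static here although the answer calls them as instance methods.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

public static class GitHubBlobRequests
{
    // Assumed shape of GetBlobUrl: the Git Data API endpoint for a single blob.
    public static string GetBlobUrl(string owner, string repositoryName, string sha1)
    {
        return $"https://api.github.com/repos/{owner}/{repositoryName}/git/blobs/{sha1}";
    }

    // Assumed shape of BuildRequestMessage: an authenticated GET request.
    public static HttpRequestMessage BuildRequestMessage(string url, string personalToken)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", personalToken);
        // GitHub rejects API requests without a User-Agent header; this value is arbitrary.
        request.Headers.UserAgent.Add(new ProductInfoHeaderValue("blob-reader-sketch", "1.0"));
        // Ask for the default JSON representation (Base64 content plus metadata).
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/vnd.github+json"));
        return request;
    }
}
```

The JSON returned by this endpoint carries the content, encoding, url, sha, and size fields that ContentResponseType maps.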

Here is my console test app...

    static async Task Main(string[] args)
    {
        // GitHub variables
        var owner = "{Put Owner Name here}";
        var personalGitHubToken = "{Put your Token here}";
        var repo = "{Put Repo Name Here}";
        var branch = "master";
        var referencePath = "{Put path and filename here}";

        // Get the existing Domain Model file
        var api = new GitHubRepoApi();
        var response = await api.RetrieveFileAsync(owner: owner, personalToken: personalGitHubToken, repositoryName: repo, pathName: referencePath, branch: branch);
        var currentFileText = Encoding.UTF8.GetString(Convert.FromBase64String(response.content));

        // Change the description of the JSON Domain Model
        currentFileText = currentFileText.Replace(@"""description"":""SubDomain", @"""description"":""Domain");

        // Update the changes back to GitHub repo using Octokit
        var client = new GitHubClient(new Octokit.ProductHeaderValue(repo));
        var tokenAuth = new Credentials(personalGitHubToken);
        client.Credentials = tokenAuth;
        var updateChangeSet = await client.Repository.Content.UpdateFile(owner, repo, referencePath, new UpdateFileRequest("Domain Model was updated via automation", currentFileText, response.sha, branch));

        // Read back the changes to confirm all works
        response = await api.RetrieveFileAsync(owner: owner, personalToken: personalGitHubToken, repositoryName: repo, pathName: referencePath, branch: branch);
        currentFileText = Encoding.UTF8.GetString(Convert.FromBase64String(response.content));
    }
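The console app decodes response.content in two places. As a small sketch, that step could be factored into one helper that also checks the encoding field of the blob response. The class BlobContent and the method DecodeUtf8 are hypothetical names, and the helper assumes the file itself is UTF-8 text, as the console app does.

```csharp
using System;
using System.Text;

public static class BlobContent
{
    // Hypothetical helper: turns the "content" field of a blob response into text.
    public static string DecodeUtf8(string content, string encoding)
    {
        if (string.Equals(encoding, "base64", StringComparison.OrdinalIgnoreCase))
        {
            // Convert.FromBase64String ignores whitespace, so Base64 that is
            // wrapped across several lines decodes without any cleanup.
            return Encoding.UTF8.GetString(Convert.FromBase64String(content));
        }

        // Any other encoding value is treated as already being plain text.
        return content;
    }
}
```

In the console app this would be called as BlobContent.DecodeUtf8(response.content, response.encoding).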

I'm sure there are many other ways to do this, but this works for me, and I hope it helps make someone else's life easier.

Cheers, John