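Uploading a CSV file from a GCS bucket to a remote SFTP location with Python

Problem description

I am trying to use Python to send a CSV file from a Google Cloud Storage (GCS) bucket to a remote SFTP location.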

import pysftp
from google.cloud import storage
from google.cloud.storage import Blob

client = storage.Client()
bucket = client.bucket("bucket_path")
blob = bucket.blob("FILE.csv")
cnopts = pysftp.CnOpts()
cnopts.hostkeys = None
with pysftp.Connection(host='remote_server', username='user', password='password', port=22, cnopts=cnopts) as sftp:
  print("Connection succesfully established ... ")
  remote_file=sftp.open('remote_location/sample.csv','w+')
  blob.download_to_file(remote_file)

I get the following error:

Connection succesfully established ... 
Traceback (most recent call last):
  File "/dirvenv/lib/python3.8/site-packages/google/cloud/storage/blob.py", line 997, in download_to_file
    self._do_download(
  File "/dirvenv/lib/python3.8/site-packages/google/cloud/storage/blob.py", line 872, in _do_download
    response = download.consume(transport, timeout=timeout)
  File "/dirvenv/lib/python3.8/site-packages/google/resumable_media/requests/download.py", line 168, in consume
    self._process_response(result)
  File "/dirvenv/lib/python3.8/site-packages/google/resumable_media/_download.py", line 185, in _process_response
    _helpers.require_status_code(
  File "/dirvenv/lib/python3.8/site-packages/google/resumable_media/_helpers.py", line 106, in require_status_code
    raise common.InvalidResponse(
google.resumable_media.common.InvalidResponse: ('Request Failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/dirPycharmProjects/leanplum/file_ftp.py", line 15, in <module>
    blob.download_to_file(remote_file)
  File "/dirvenv/lib/python3.8/site-packages/google/cloud/storage/blob.py", line 1008, in download_to_file
    _raise_from_invalid_response(exc)
  File "/dirvenv/lib/python3.8/site-packages/google/cloud/storage/blob.py", line 3262, in _raise_from_invalid_response
    raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.NotFound: 404 GET https://storage.googleapis.com/download/storage/v1/b/gs://bucket_name/o/FILE.csv?alt=media: ('Request Failed with status code', <HTTPStatus.PARTIAL_CONTENT: 206>)

Process finished with exit code 1

Any suggestions?

Solution

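The error "TypeError: expected str, bytes or os.PathLike object, not SFTPFile" means that you are passing an SFTPFile object to download_to_filename(), which expects a str, bytes, or os.PathLike object (that is, a local file path).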
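
Note that the traceback shown in the question is a different failure: the request URL contains b/gs://bucket_name/o/FILE.csv, which suggests the bucket name was passed with a gs:// prefix, whereas client.bucket() expects the bare bucket name. A minimal sketch of a guard against that follows; normalize_bucket_name is a hypothetical helper written for illustration and is not part of the google-cloud-storage API.

```python
def normalize_bucket_name(name: str) -> str:
    """Return the bare bucket name from a name or a gs:// URI (hypothetical helper)."""
    if name.startswith("gs://"):
        name = name[len("gs://"):]
    # Drop any trailing slash or object path: "my-bucket/dir/file" -> "my-bucket"
    return name.split("/", 1)[0]
```

The result would then be passed to client.bucket(...) in place of the raw string.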

As I understand it, your use case is to upload a CSV file that currently lives in Cloud Storage to a remote SFTP location.

I therefore suggest that you first download the contents of the blob from the Cloud Storage bucket to a local file, using the following sample:

from google.cloud import storage


def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "storage-object-name"
    # destination_file_name = "local/path/to/file"

    storage_client = storage.Client()

    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)

    print(
        "Blob {} downloaded to {}.".format(
            source_blob_name, destination_file_name
        )
    )

Then, once the contents of the blob have been downloaded locally, you can upload the file to the remote SFTP location with the following sample code:

import pysftp

with pysftp.Connection('hostname', username='[YOUR_USERNAME]', password='[YOUR_PASSWORD]') as sftp:
    with sftp.cd('public'):             # temporarily chdir to public
        sftp.put('/my/local/filename')  # upload file to public/ on remote
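
The two steps can be combined so that the local copy is only temporary. The following is a rough sketch under stated assumptions, not a definitive implementation: copy_blob_to_sftp is a hypothetical helper name, blob is assumed to be a google.cloud.storage Blob, and sftp is assumed to be an already open pysftp.Connection. It relies only on blob.download_to_filename() and sftp.put(), the same calls used in the snippets above.

```python
import os
import tempfile


def copy_blob_to_sftp(blob, sftp, remote_path):
    """Download `blob` to a temporary file, then upload it to `remote_path`."""
    # mkstemp creates the file and returns a real path, which is what
    # download_to_filename() expects (a path, not a file object).
    fd, local_path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        blob.download_to_filename(local_path)  # GCS -> local temp file
        sftp.put(local_path, remote_path)      # local temp file -> SFTP
    finally:
        os.remove(local_path)                  # always clean up the temp copy
```

It would be called inside the `with pysftp.Connection(...) as sftp:` block, for example as copy_blob_to_sftp(bucket.blob("FILE.csv"), sftp, "remote_location/sample.csv"), reusing the names from the question.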

For more examples, see this Stack Overflow question.