
ios - How to write large files to disk efficiently on a background thread (Swift)


Update

I have since resolved and removed the distracting error. Please read the entire post, and feel free to comment if any questions remain.

Background

I am trying to write relatively large files (video) to disk on iOS using Swift 2.0, GCD, and a completion handler, and I would like to know whether there is a more efficient way to perform this task. The task needs to complete without blocking the main UI, use completion logic, and ensure the operation happens as quickly as possible. I have a custom object with an NSData property, so I am currently experimenting with an extension on NSData. As an example, an alternate solution might use NSFileHandle or NSStreams together with some form of thread-safe behavior that achieves higher throughput than the NSData writeToURL function my current solution is based on.

What's wrong with NSData?

Please note the following discussion from the NSData Class Reference (Saving Data). I do perform my writes to the temporary directory; however, the main reason I am having an issue is that I can see a noticeable lag in the UI when dealing with large files. That lag occurs precisely because NSData is not asynchronous (and the Apple docs note that atomic writes can cause performance issues on "large" files, roughly > 1 MB). So when dealing with large files, whatever internal mechanism is at work within the NSData methods comes into play.

I did some more digging and found this from Apple: "This method is ideal for converting data:// URLs to NSData objects, and can also be used for reading short files synchronously. If you need to read potentially large files, use inputStreamWithURL: to open a stream, then read the file a piece at a time." (NSData Class Reference, Objective-C, +dataWithContentsOfURL). This information seems to imply that, if moving writeToURL to a background thread (as @jtbandes suggested) is not enough, I could try using streams to write the file out on a background thread.
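To make that concrete, here is a minimal, hedged sketch of the chunked-read pattern those docs describe, written against Swift 2 era APIs; the function name and the 64 KB chunk size are my own assumptions:

    func readFileInChunks(fileURL: NSURL, chunkSize: Int = 64 * 1024) {
        guard let stream = NSInputStream(URL: fileURL) else { return }
        stream.open()
        defer { stream.close() }

        var buffer = [UInt8](count: chunkSize, repeatedValue: 0)
        while stream.hasBytesAvailable {
            let bytesRead = stream.read(&buffer, maxLength: buffer.count)
            if bytesRead <= 0 { break } // 0 = end of stream, < 0 = error
            // Process buffer[0..<bytesRead] here; the whole file is never
            // resident in memory at once.
        }
    }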

The NSData class and its subclasses provide methods to quickly and easily save their contents to disk. To minimize the risk of data loss, these methods provide the option of saving the data atomically. Atomic writes guarantee that the data is either saved in its entirety, or it fails completely. The atomic write begins by writing the data to a temporary file. If this write succeeds, then the method moves the temporary file to its final location.

While atomic write operations minimize the risk of data loss due to corrupt or partially-written files, they may not be appropriate when writing to a temporary directory, the user’s home directory or other publicly accessible directories. Any time you work with a publicly accessible file, you should treat that file as an untrusted and potentially dangerous resource. An attacker may compromise or corrupt these files. The attacker can also replace the files with hard or symbolic links, causing your write operations to overwrite or corrupt other system resources.

Avoid using the writeToURL:atomically: method (and the related methods) when working inside a publicly accessible directory. Instead initialize an NSFileHandle object with an existing file descriptor and use the NSFileHandle methods to securely write the file.
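For completeness, a simplified NSFileHandle sketch follows. Note it uses the path-based initializer for brevity rather than the existing-file-descriptor approach the note above recommends for publicly accessible directories; the helper name is my own:

    func appendData(data: NSData, toPath path: String) -> Bool {
        let fm = NSFileManager.defaultManager()
        // NSFileHandle will not create the file, so create it first if needed.
        if !fm.fileExistsAtPath(path) {
            fm.createFileAtPath(path, contents: nil, attributes: nil)
        }
        guard let handle = NSFileHandle(forWritingAtPath: path) else { return false }
        defer { handle.closeFile() }

        handle.seekToEndOfFile()
        handle.writeData(data) // raises an Objective-C exception on failure
        return true
    }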



Other Alternatives

An article on concurrent programming at objc.io offers interesting options under "Advanced: File I/O in the Background". Some of those options also involve an InputStream. Apple also has some older references for reading and writing files asynchronously. I am posting this question in the hope of Swift alternatives.

Example of a Suitable Answer

Here is an example of a suitable answer that might satisfy this kind of question. (Taken from the Stream Programming Guide, Writing To Output Streams; a simplified sketch follows the steps.)

Writing to an output stream using an NSOutputStream instance requires several steps:
  • Create and initialize an instance of NSOutputStream with a repository for the written data. Also set a delegate.
  • Schedule the stream object on a run loop and open the stream.
  • Handle the events that the stream object reports to its delegate.
  • If the stream object has written data to memory, obtain the data by requesting the NSStreamDataWrittenToMemoryStreamKey property.
  • When there is no more data to write, dispose of the stream object.
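A hedged, minimal sketch of that idea in Swift 2: rather than run-loop scheduling with a delegate, this version simply blocks on a background queue and polls write(_:maxLength:) until the buffer is drained. The function name and error handling are my own assumptions, not the guide's.

    func writeData(data: NSData, toFileURL fileURL: NSURL) -> Bool {
        // Open an output stream directly onto the destination file.
        guard let stream = NSOutputStream(URL: fileURL, append: false) else { return false }
        stream.open()
        defer { stream.close() }

        var bytes = UnsafePointer<UInt8>(data.bytes)
        var bytesRemaining = data.length

        while bytesRemaining > 0 {
            // write(_:maxLength:) returns the number of bytes actually
            // written, or a negative value on failure.
            let written = stream.write(bytes, maxLength: bytesRemaining)
            if written < 0 { return false } // inspect stream.streamError for details
            bytesRemaining -= written
            bytes = bytes.advancedBy(written)
        }
        return true
    }

Called from a dispatch_async block, this writes incrementally while staying off the main thread.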

  • I am looking for the most proficient algorithm for writing extremely large files on iOS; Swift, standard APIs, or possibly even C/ObjC would suffice. I can transpose the algorithm into appropriate Swift-compatible constructs.



    Nota Bene

    I understand the informational error below. It is included for completeness. This question is asking whether or not there is a better algorithm to use for writing large files to disk with a guaranteed dependency sequence (e.g. NSOperation dependencies). If there is, please provide enough information (description/sample) for me to reconstruct pertinent Swift 2.0 compatible code. Please advise if I am missing any information that would help answer the question.



    Extension Considerations

    I've added a completion handler to the base writeToURL to ensure that no unintended resource sharing occurs. My dependent tasks that use the file should never face a race condition.


    extension NSData {

        func writeToURL(named: String, completion: (result: Bool, url: NSURL?) -> Void) {

            let filePath = NSTemporaryDirectory() + named
            let tmpURL = NSURL(fileURLWithPath: filePath)

            dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)) {
                // Write to the URL atomically off the main thread. The closure
                // retains self (the NSData), so no weak reference is needed.
                let success = self.writeToURL(tmpURL, atomically: true)
                    && NSFileManager.defaultManager().fileExistsAtPath(filePath)

                // Call back on the main queue so the handler can present UI,
                // and report failure as well as success.
                dispatch_async(dispatch_get_main_queue()) {
                    completion(result: success, url: success ? tmpURL : nil)
                }
            }
        }
    }

    This method is used to process a custom object's data from a controller, as follows:
    var items = [AnyObject]()

    if let video = myCustomClass.data {

        //video is of type NSData
        video.writeToURL("shared.mp4", completion: { (result, url) -> Void in
            if result {
                items.append(url!)
                if items.count > 0 {

                    let sharedActivityView = UIActivityViewController(activityItems: items, applicationActivities: nil)

                    self.presentViewController(sharedActivityView, animated: true) { () -> Void in
                        //finished
                    }
                }
            }
        })
    }

    Conclusion

    The Apple documentation at Core Data Performance provides some good advice on dealing with memory pressure and managing BLOBs. It really is an article with a host of clues about behavior and how to moderate the issue of large files within an app. Now, although it is specific to Core Data rather than files, the warning about atomic writing does tell me that I should implement methods that write atomically with a great deal of care.

    For large files, the only safe way to manage the write appears to be adding a completion handler (to the write method) and showing an activity view on the main thread. Whether one does that with streams or by modifying an existing API to add completion logic is up to the reader. I have done both in the past and am testing for the best performance.

    Until then, I am changing the solution to remove all binary data properties from Core Data and replacing them with strings that save asset URLs on disk. I am also leveraging the built-in functionality of the Assets Library and PHAsset to grab and store all related asset URLs. When or if I need to copy any assets, I will use standard API methods (export methods on PHAsset / the Assets Library) with completion handlers to notify the user of the finished state on the main thread.
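    As an illustration of that direction, here is a hedged sketch of exporting a video by reference through Photos/AVFoundation rather than copying bytes through Core Data. The helper name, preset choice, and identifier handling are assumptions, not code from my project:

    import Photos
    import AVFoundation

    func exportVideo(localIdentifier: String, toURL outputURL: NSURL, completion: (Bool) -> Void) {
        // Resolve the stored identifier back to a PHAsset.
        let assets = PHAsset.fetchAssetsWithLocalIdentifiers([localIdentifier], options: nil)
        guard let asset = assets.firstObject as? PHAsset else {
            completion(false); return
        }

        PHImageManager.defaultManager().requestExportSessionForVideo(asset,
            options: nil,
            exportPreset: AVAssetExportPresetHighestQuality) { exportSession, _ in
                guard let session = exportSession else {
                    dispatch_async(dispatch_get_main_queue()) { completion(false) }
                    return
                }
                session.outputURL = outputURL
                session.outputFileType = AVFileTypeQuickTimeMovie
                session.exportAsynchronouslyWithCompletionHandler {
                    let ok = (session.status == .Completed)
                    // Notify the caller of the finished state on the main thread.
                    dispatch_async(dispatch_get_main_queue()) { completion(ok) }
                }
        }
    }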

    (A very useful excerpt from the Core Data Performance article)

    Reducing Memory Overhead

    It is sometimes the case that you want to use managed objects on a temporary basis, for example to calculate an average value for a particular attribute. This causes your object graph, and memory consumption, to grow. You can reduce the memory overhead by re-faulting individual managed objects that you no longer need, or you can reset a managed object context to clear an entire object graph. You can also use patterns that apply to Cocoa programming in general.

    You can re-fault an individual managed object using NSManagedObjectContext’s refreshObject:mergeChanges: method. This has the effect of clearing its in-memory property values thereby reducing its memory overhead. (Note that this is not the same as setting the property values to nil—the values will be retrieved on demand if the fault is fired—see Faulting and Uniquing.)

    When you create a fetch request you can set includesPropertyValues to NO to reduce memory overhead by avoiding creation of objects to represent the property values. You should typically only do so, however, if you are sure that either you will not need the actual property data or you already have the information in the row cache, otherwise you will incur multiple trips to the persistent store.

    You can use the reset method of NSManagedObjectContext to remove all managed objects associated with a context and "start over" as if you'd just created it. Note that any managed object associated with that context will be invalidated, and so you will need to discard any references to and re-fetch any objects associated with that context in which you are still interested. If you iterate over a lot of objects, you may need to use local autorelease pool blocks to ensure temporary objects are deallocated as soon as possible.

    If you do not intend to use Core Data’s undo functionality, you can reduce your application's resource requirements by setting the context’s undo manager to nil. This may be especially beneficial for background worker threads, as well as for large import or batch operations.

    Finally, Core Data does not by default keep strong references to managed objects (unless they have unsaved changes). If you have lots of objects in memory, you should determine the owning references. Managed objects maintain strong references to each other through relationships, which can easily create strong reference cycles. You can break cycles by re-faulting objects (again by using the refreshObject:mergeChanges: method of NSManagedObjectContext).

    Large Data Objects (BLOBs)

    If your application uses large BLOBs ("Binary Large OBjects" such as image and sound data), you need to take care to minimize overheads. The exact definition of “small”, “modest”, and “large” is fluid and depends on an application’s usage. A loose rule of thumb is that objects in the order of kilobytes in size are of a “modest” sized and those in the order of megabytes in size are “large” sized. Some developers have achieved good performance with 10MB BLOBs in a database. On the other hand, if an application has millions of rows in a table, even 128 bytes might be a "modest" sized CLOB (Character Large OBject) that needs to be normalized into a separate table.

    In general, if you need to store BLOBs in a persistent store, you should use an SQLite store. The XML and binary stores require that the whole object graph reside in memory, and store writes are atomic (see Persistent Store Features) which means that they do not efficiently deal with large data objects. SQLite can scale to handle extremely large databases. Properly used, SQLite provides good performance for databases up to 100GB, and a single row can hold up to 1GB (although of course reading 1GB of data into memory is an expensive operation no matter how efficient the repository).

    A BLOB often represents an attribute of an entity—for example, a photograph might be an attribute of an Employee entity. For small to modest sized BLOBs (and CLOBs), you should create a separate entity for the data and create a to-one relationship in place of the attribute. For example, you might create Employee and Photograph entities with a one-to-one relationship between them, where the relationship from Employee to Photograph replaces the Employee's photograph attribute. This pattern maximizes the benefits of object faulting (see Faulting and Uniquing). Any given photograph is only retrieved if it is actually needed (if the relationship is traversed).

    It is better, however, if you are able to store BLOBs as resources on the filesystem, and to maintain links (such as URLs or paths) to those resources. You can then load a BLOB as and when necessary.
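    A minimal sketch of the pattern recommended in that last paragraph: write the BLOB to disk once and persist only a reference. Here videoData, managedObject, and the videoFileName attribute are assumptions about the surrounding model:

    // Write the video to the Documents directory under a unique name.
    let fileName = NSUUID().UUIDString + ".mp4"
    let docsDir = NSFileManager.defaultManager()
        .URLsForDirectory(.DocumentDirectory, inDomains: .UserDomainMask)[0]
    let fileURL = docsDir.URLByAppendingPathComponent(fileName)

    if videoData.writeToURL(fileURL, atomically: true) {
        // Store the lightweight reference, not the bytes, in Core Data.
        managedObject.setValue(fileName, forKey: "videoFileName")
    }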



    Note:

    I've moved the logic below into the completion handler (see the code above) and I no longer see any error. As mentioned before this question is about whether or not there is a more performant way to process large files in iOS using Swift.



    When attempting to process the resulting items array to pass to a UIActivityViewController, the following logic was used:

    if items.count > 0 {
        let sharedActivityView = UIActivityViewController(activityItems: items, applicationActivities: nil)
        self.presentViewController(sharedActivityView, animated: true) { () -> Void in
            //finished
        }
    }

    I saw the following error: Communications error: { count = 1, contents = "XPCErrorDescription" => { length = 22, contents = "Connection interrupted" } }> (Note that I am looking for a better design, not an answer to this error message.)

    Best Answer

    Performance depends on whether the data fits in RAM. If it does, then you should use NSData writeToURL with the atomically option turned on, which is what you are doing.

    Apple's note about the danger of "writing to a public directory" is completely irrelevant on iOS, because there are no public directories. That section only applies to OS X. And frankly, it is not really important there either.

    So, as long as the video fits in RAM (about 100 MB is a safe limit), the code you wrote is as efficient as it gets.

    For files that do not fit in RAM, you need to use streams, or your app will crash while holding the video in memory. To download a large video from a server and write it to disk, you should use NSURLSessionDownloadTask.

    In general, streaming (including NSURLSessionDownloadTask) will be orders of magnitude slower than NSData.writeToURL(). So do not use streams unless you need to. All operations on NSData are extremely fast; it is perfectly capable of dealing with files that are multiple terabytes in size with excellent performance on OS X (iOS obviously cannot have files that large, but it is the same class with the same performance).
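    For reference, a minimal sketch of the NSURLSessionDownloadTask approach mentioned above; remoteURL and destinationURL are assumptions:

    let task = NSURLSession.sharedSession().downloadTaskWithURL(remoteURL) { location, response, error in
        // The response is streamed straight to a temporary file on disk,
        // which is deleted after this handler returns, so move it immediately.
        guard let location = location where error == nil else { return }
        _ = try? NSFileManager.defaultManager().moveItemAtURL(location, toURL: destinationURL)
    }
    task.resume()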

    There are a few issues in your code.

    This is wrong:

    let filePath = NSTemporaryDirectory() + named

    Instead, always do:

    let filePath = NSTemporaryDirectory().stringByAppendingPathComponent(named)

    But that is not ideal either; you should avoid using paths (they are buggy and slow). Instead use a URL like this:

    let tmpDir = NSURL(fileURLWithPath: NSTemporaryDirectory())!
    let fileURL = tmpDir.URLByAppendingPathComponent(named)

    Also, you are using a path to check whether the file exists... do not do this:

    if NSFileManager.defaultManager().fileExistsAtPath( filePath ) {

    Instead use NSURL to check whether it exists:

    if fileURL.checkResourceIsReachableAndReturnError(nil) {
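    Putting those corrections together, a hedged consolidated version of the question's extension might look like this (still Swift 2 style; the main-queue callback is my own addition):

    extension NSData {

        func writeToURL(named: String, completion: (result: Bool, url: NSURL?) -> Void) {
            // Build a file URL instead of concatenating path strings.
            let tmpDir = NSURL(fileURLWithPath: NSTemporaryDirectory(), isDirectory: true)
            let fileURL = tmpDir.URLByAppendingPathComponent(named)

            dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)) {
                // Check reachability on the URL rather than a path string.
                let success = self.writeToURL(fileURL, atomically: true)
                    && fileURL.checkResourceIsReachableAndReturnError(nil)
                dispatch_async(dispatch_get_main_queue()) {
                    completion(result: success, url: success ? fileURL : nil)
                }
            }
        }
    }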

    Regarding "ios - How to write large files to disk efficiently on a background thread (Swift)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31965566/
