Problem description
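I have a long-running video task in a Xamarin.Android app. It uses a MediaPlayer to decode a video file into a custom OpenGL ES Surface, then queues up another Surface, encodes the data with MediaCodec and drains it into a ByteBuffer, which is then passed to a MediaMuxer based on the availability of the encoder's output ByteBuffer. The operation works well and fast until the total bytes written to the video file exceed 1.3GB, at which point the video (but not the audio) locks up.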
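The app appears to hold too many GREFs: I'm watching them go up and down in real time until they finally sit well above 46000. It seems the OS (or the app?) has trouble dumping all the GREFs via GC, which causes the app to get stuck in the middle of video processing. I'm monitoring Android resources, and total available memory never changes much at the OS level; the CPU also always seems to have plenty of idle headroom (~28%).
I'm writing output to the system console and viewing gref logging with the following command: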
adb shell setprop debug.mono.log gref
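After about 14 minutes, garbage collection seems unable to keep up. The GREF count goes up, then down, then up again. Eventually it gets so high that the count stays above 46k, with the following messages looping: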
09-26 15:07:11.613 I/monodroid-gc(11213): 46111 outstanding GREFs. Performing a full GC!
09-26 15:07:11.898 I/zygote64(11213): Explicit concurrent copying GC freed 9(32KB) AllocSpace objects,0(0B) LOS objects,70% free,2MB/8MB,paused 434us total 63.282ms
09-26 15:07:13.470 D/Mono (11213): GC_TAR_BRIDGE bridges 22974 objects 23013 opaque 1 colors 22974 colors-bridged 22974 colors-visible 22974 xref 1 cache-hit 0 cache-semihit 0 cache-miss 0 setup 3.40ms tarjan 25.53ms scc-setup 14.85ms gather-xref 1.76ms xref-setup 0.50ms cleanup 13.81ms
09-26 15:07:13.470 D/Mono (11213): GC_BRIDGE: Complete,was running for 1798.94ms
09-26 15:07:13.470 D/Mono (11213): GC_MAJOR: (user request) time 54.95ms,stw 57.82ms los size: 5120K in use: 1354K
09-26 15:07:13.470 D/Mono (11213): GC_MAJOR_SWEEP: major size: 7648K in use: 6120K
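The GREF logs look like this, except multiplied by thousands. I can see the number go up, then down, then up again, until it is huge and finally sits far above 46k, at which point the app (or the OS?) seems to give up trying to clear these GREFs:
grefc 38182 gwrefc 5
Here 38182 is the count that rises and falls until it exceeds 46k.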
09-30 22:42:11.013 I/monodroid-gref(20765): -g- grefc 38182 gwrefc 51 handle 0x98156/G from thread 'finalizer'(25420)
09-30 22:42:11.013 I/monodroid-gref(20765): +w+ grefc 38181 gwrefc 52 obj-handle 0x980f6/G -> new-handle 0xbc3/W from thread 'finalizer'(25420)
09-30 22:42:11.013 I/monodroid-gref(20765): -g- grefc 38181 gwrefc 52 handle 0x980f6/G from thread 'finalizer'(25420)
The GC system also logs this warning:
warning: not replacing previous registered handle 0x30192 with handle 0x62426 for key_handle 0x9b1ac32
10-03 13:15:25.453 I/monodroid-gref(22127): +g+ grefc 24438 gwrefc 0 obj-handle 0x9/I -> new-handle 0x62416/G from thread 'Thread Pool Worker'(44)
10-03 13:15:25.476 I/monodroid-gref(22127): +g+ grefc 24439 gwrefc 0 obj-handle 0x30192/I -> new-handle 0x62426/G from thread 'Thread Pool Worker'(44)
10-03 13:15:25.477 I/monodroid-gref(22127): warning: not replacing previous registered handle 0x30192 with handle 0x62426 for key_handle 0x9b1ac32
10-03 13:15:25.483 I/monodroid-gref(22127): +g+ grefc 24440 gwrefc 0 obj-handle 0x9/I -> new-handle 0x62436/G from thread 'Thread Pool Worker'(44)
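Additionally, the video seems to freeze whenever garbage collection runs, even when it does not get stuck in a loop. That is another problem I'm looking for hints or answers on.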
// Even if we don't access the SurfaceTexture after the constructor returns, we
// still need to keep a reference to it. The Surface doesn't retain a reference
// at the Java level, so if we don't either then the object can get GCed, which
// causes the native finalizer to run.
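I think this is key to the problem I'm having, but what confuses me is how the app is supposed to keep encoding if garbage collection can't run. I see a lot of these in the GREF logs: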
10-03 13:07:04.897 I/monodroid-gref(22127): +g+ grefc 6472 gwrefc 4825 obj-handle 0x3727/W -> new-handle 0x2982a/G from thread 'finalizer'(24109)
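So does this GREF log entry mean I need the finalizer to complete? Or does it mean I shouldn't have allowed the finalizer to run before the video finished encoding?
I did some reading on this and checked out Java code that does the same type of operation. At that point I tried adding a WeakReference to the parent class. The video encoding seems to get much further with the weak reference, but it still locks up eventually.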
private void setup() {
    _textureRender = new TextureRender();
    _textureRender.SurfaceCreated();
    // Even if we don't access the SurfaceTexture after the constructor returns, we
    // still need to keep a reference to it. The Surface doesn't retain a reference
    // at the Java level, so if we don't either then the object can get GCed, which
    // causes the native finalizer to run.
    _surfaceTexture = new SurfaceTexture(_textureRender.TextureId);
    Parent.WeakSurfaceTexture.FrameAvailable += FrameAvailable; // notice the Weak references here
    _surface = new Surface(Parent.WeakSurfaceTexture);
}
Here is how I get the weak parent reference:
public System.WeakReference weakParent;
private OutputSurface Parent {
    get {
        if (weakParent == null || !weakParent.IsAlive)
            return null;
        return weakParent.Target as OutputSurface;
    }
}
public SurfaceTexture WeakSurfaceTexture {
    get { return Parent.SurfaceTexture; }
}
When the app actually locks up in the GC loop, it is stuck on this line:
var curDisplay = EGLContext.EGL.JavaCast<IEGL10>().EglGetCurrentDisplay();
in this context:
const int TIMEOUT_MS = 20000;
public bool AwaitNewImage(bool returnOnFailure = false) {
    // lock releases the monitor on every exit path, including the early returns below
    lock (_frameSyncObject) {
        while (!IsFrameAvailable) {
            try {
                // Wait for onFrameAvailable() to signal us. Use a timeout to avoid
                // stalling the test if it doesn't arrive.
                System.Threading.Monitor.Wait(_frameSyncObject, TIMEOUT_MS);
                if (!IsFrameAvailable) {
                    if (returnOnFailure) {
                        return false;
                    }
                    // TODO: if "spurious wakeup", continue while loop
                    //throw new RuntimeException ("frame wait timed out");
                }
            } catch (InterruptedException) {
                if (returnOnFailure) {
                    return false;
                }
                // shouldn't happen
                //throw new RuntimeException (ie);
            }
        }
        IsFrameAvailable = false;
    }
    //the app is locking up on the next line:
    var curDisplay = EGLContext.EGL.JavaCast<IEGL10>().EglGetCurrentDisplay();
    _textureRender.CheckGlError("before updateTexImage");
    Parent.WeakSurfaceTexture.UpdateTexImage();
    return true;
}
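So is this a case where I need to stop the finalizer from running? Or is the finalizer causing too many GREFs? Do I need to dispose some of these frame-rendering SurfaceTexture objects before continuing to process video? Do I need to pause the MediaPlayer and dump all of these references before continuing the read/write process?
Do I need to optimize my code somehow? I read that too many java.lang.Object instantiations or usages can cause a GREF overflow (or something like that?). I combed through my code and couldn't find anything inheriting from java.lang.Object that runs in this loop.
Or am I way off and it's something else?
I'm basically just trying to figure out how to resolve a video encoder lockup during a GC loop. Any pointers or things to look for would be much appreciated. I've also noticed that garbage collection (when it happens) seems to make the frames stutter briefly, so that's something I'd like to fix too.
Here is the full code base:
Please advise.
EDIT: I just noticed that the branch I posted inherits from java.lang.Object on the OutputSurface class. I removed this and pushed the branch again. I have a bunch of branches from trying to get this to work, and I had backtracked to one that still inherited from this class. I know that in many previous attempts I removed all java.lang.Object inheritance from the project and it still locked up on GC.
UPDATE: When I run the code in the branch above, I don't see the GREFs exceed 46k, but the video still seems to lock up on garbage collection. The difference is that the video processing now actually finishes, and the GREF count still gets really close to 46k. I think a really long video would still push it over 46k, since the count keeps increasing as processing continues.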
Solution
It turned out that all I had to do was comment out the suspect line I mentioned:
var curDisplay = EGLContext.EGL.JavaCast<IEGL10>().EglGetCurrentDisplay();
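It was running in a loop, called thousands of times over the course of a full video.
What must have been happening is that these EGLDisplay instances (the curDisplay variable) were not being garbage collected properly. I assumed they would be collected automatically once the method finished, but something was preventing that. If you know more about this, feel free to give a better answer; I'm not sure what caused the finalizer to hang on those objects.
That alone doesn't really help with solving this kind of problem in general, so here is how I tracked it down:
First, I added this code to MainActivity OnCreate. It copies the GREF log to a file in the download folder at the root of the device's external storage, then loops and refreshes it every 120 seconds (or whatever interval you choose):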
#if DEBUG
Task.Run(async () =>
{
    const int seconds = 120;
    const string grefTag = "monodroid-gref";
    const string grefsFile = "grefs.txt";
    while (true)
    {
        // Private app path where grefs.txt is expected once gref logging is enabled
        var grefFile = System.IO.Path.Combine("/data/data", PackageName, "files/.__override__", grefsFile);
        var grefFilePublic = System.IO.Path.Combine(Android.OS.Environment.ExternalStorageDirectory + Java.IO.File.Separator + "download", grefsFile);
        if (System.IO.File.Exists(grefFile))
        {
            // Copy to public storage so the file can be pulled with adb
            System.IO.File.Copy(grefFile, grefFilePublic, true);
            // Log.Info(tag, message); Console.Write(tag, message) would treat the tag as a format string
            Android.Util.Log.Info(grefTag, $"adb pull {grefFilePublic} {grefsFile}");
        }
        else
            Android.Util.Log.Info(grefTag, "no grefs.txt found, gref logging enabled? (adb shell setprop debug.mono.log gref)");
        await Task.Delay(seconds * 1000);
    }
});
#endif
Then run this command to enable gref logging on the device:
adb shell setprop debug.mono.log gref
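Then I ran the app and let the video processor go until it eventually stalled. Afterwards, I collected the .txt file from the download folder and inspected it with Visual Studio Code (since it handles large files easily).
In my case there was a sequence that looked like this, repeated thousands of times: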
take_weak_global_ref_jni
-g- grefc 25196 gwrefc 7 handle 0x6495a/G from thread 'finalizer'(27691)
take_weak_global_ref_jni
*take_weak obj=0x7c1046df60; handle=0x64106
+w+ grefc 25195 gwrefc 8 obj-handle 0x64106/G -> new-handle 0x953/W from thread 'finalizer'(27691)
take_weak_global_ref_jni
-g- grefc 25195 gwrefc 8 handle 0x64106/G from thread 'finalizer'(27691)
take_weak_global_ref_jni
*take_weak obj=0x7c19c4e630; handle=0x64fd6
+w+ grefc 25194 gwrefc 9 obj-handle 0x64fd6/G -> new-handle 0x963/W from thread 'finalizer'(27691)
take_weak_global_ref_jni
-g- grefc 25194 gwrefc 9 handle 0x64fd6/G from thread 'finalizer'(27691)
take_weak_global_ref_jni
*take_weak obj=0x7c1046df98; handle=0x63d9a
+w+ grefc 25193 gwrefc 10 obj-handle 0x63d9a/G -> new-handle 0x973/W from thread 'finalizer'(27691)
take_weak_global_ref_jni
-g- grefc 25193 gwrefc 10 handle 0x63d9a/G from thread 'finalizer'(27691)
take_weak_global_ref_jni
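I took these to be stuck memory addresses (handles) that could not be collected. Note handle=0x64fd6.
So I searched the .txt file for that handle, which led me to this: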
+g+ grefc 25190 gwrefc 0 obj-handle 0x4eaba/I -> new-handle 0x64fd6/G from thread 'Thread Pool Worker'(8)
at Android.Runtime.AndroidObjectReferenceManager.CreateGlobalReference (Java.Interop.JniObjectReference value) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at Java.Interop.JniObjectReference.NewGlobalRef () [0x00000] in <286213b9e14c442ba8d8d94cc9dbec8e>:0
at Android.Runtime.JNIEnv.NewGlobalRef (System.IntPtr jobject) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at Java.Lang.Object.RegisterInstance (Android.Runtime.IJavaObject instance,System.IntPtr value,Android.Runtime.JniHandleOwnership transfer,System.IntPtr& handle) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at Java.Lang.Object.SetHandle (System.IntPtr value,Android.Runtime.JniHandleOwnership transfer) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at Java.Lang.Object..ctor (System.IntPtr handle,Android.Runtime.JniHandleOwnership transfer) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at Javax.Microedition.Khronos.Egl.IEGL10Invoker..ctor (System.IntPtr handle,Android.Runtime.JniHandleOwnership transfer) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at System.Reflection.MonoCMethod.InternalInvoke (System.Reflection.MonoCMethod,System.Object,System.Object[],System.Exception& ) [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Reflection.MonoCMethod.InternalInvoke (System.Object obj,System.Object[] parameters,System.Boolean wrapExceptions) [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Reflection.MonoCMethod.DoInvoke (System.Object obj,System.Reflection.BindingFlags invokeAttr,System.Reflection.Binder binder,System.Globalization.CultureInfo culture) [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Reflection.MonoCMethod.Invoke (System.Reflection.BindingFlags invokeAttr,System.Globalization.CultureInfo culture) [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Reflection.ConstructorInfo.Invoke (System.Object[] parameters) [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at Java.Interop.TypeManager.CreateProxy (System.Type type,System.IntPtr handle,Android.Runtime.JniHandleOwnership transfer) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at Java.Interop.TypeManager.CreateInstance (System.IntPtr handle,System.Type targetType) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at Java.Lang.Object.GetObject (System.IntPtr handle,System.Type type) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at Java.Interop.JavaObjectExtensions._JavaCast[TResult] (Android.Runtime.IJavaObject instance) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at Java.Interop.JavaObjectExtensions.JavaCast[TResult] (Android.Runtime.IJavaObject instance) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at Android.Runtime.Extensions.JavaCast[TResult] (Android.Runtime.IJavaObject instance) [0x00000] in <016ee5efc3d0460baf9a60f95885ebbb>:0
at MediaCodecHelper.OutputSurface.AwaitNewImage (System.Boolean returnOnFailure) [0x00079] in C:\repos\BitChute_Mobile_Android_BottomNav_newAppBak - Copy - Copy\VideoEncoding\OutputSurface.cs:298
at MediaCodecHelper.FileToMp4.EncodeFileToMp4 (System.String inputPath,System.String outputPath,System.Boolean encodeAudio,Android.Net.Uri inputUri) [0x00202] in C:\repos\BitChute_Mobile_Android_BottomNav_newAppBak - Copy - Copy\VideoEncoding\FileToMp4.cs:253
at MediaCodecHelper.FileToMp4.Start (Android.Net.Uri inputUri,System.String inputPath) [0x00007] in C:\repos\BitChute_Mobile_Android_BottomNav_newAppBak - Copy - Copy\VideoEncoding\FileToMp4.cs:181
at BitChute.Fragments.SettingsFrag+<>c.<StartEncoderTest>b__18_0 () [0x000fe] in C:\repos\BitChute_Mobile_Android_BottomNav_newAppBak - Copy - Copy\Fragments\SettingsFrag.cs:310
at System.Threading.Tasks.Task.InnerInvoke () [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Threading.Tasks.Task.Execute () [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Threading.Tasks.Task.ExecutionContextCallback (System.Object obj) [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext executionContext,System.Threading.ContextCallback callback,System.Object state,System.Boolean preserveSyncCtx) [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext,System.Boolean preserveSyncCtx) [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Threading.Tasks.Task.ExecuteWithThreadLocal (System.Threading.Tasks.Task& currentTaskSlot) [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Threading.Tasks.Task.ExecuteEntry (System.Boolean bPreventDoubleExecution) [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Threading.Tasks.Task.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Threading.ThreadPoolWorkQueue.Dispatch () [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback () [0x00000] in <d4a23bbd2f544c30a48c44dd622ce09f>:0
handle 0x64fd6; key_handle 0x259aa25: Java Type: `com/google/android/gles_jni/EGLImpl`; MCW type: `Javax.Microedition.Khronos.Egl.IEGL10Invoker`
Notice Javax.Microedition.Khronos.Egl.IEGL10Invoker and handle 0x64fd6.
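I'd suggest examining the memory handles in the section of the log where the GC gets stuck in a loop. Then search for those handles and see whether you can find the type they refer to. Once you've found the type the GC keeps looping on, trace back through your source and find where that type is called (or instantiated) in a loop. From what I've read, the usual cause is a loop that generates undisposed Java.Lang.Object references, which makes the GC fail.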
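The handle lookup described above can be scripted instead of searched by hand. The following is a minimal sketch under stated assumptions: `GrefLookup` and `FindMcwType` are names invented for illustration (they are not part of Xamarin), and the code assumes each creation record ends with a summary line in the same form as the `handle 0x64fd6; key_handle ...; MCW type: ...` line in the excerpt above.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Xamarin): finds the managed wrapper type
// recorded for a GREF handle in gref log lines.
public static class GrefLookup
{
    // Matches the summary line that closes a creation record, e.g.
    // handle 0x64fd6; key_handle 0x259aa25: Java Type: `...`; MCW type: `...`
    private static readonly Regex TypeLine = new Regex(
        @"\bhandle (0x[0-9a-fA-F]+); key_handle [^:]+: Java Type: `[^`]*`; MCW type: `([^`]+)`");

    // Returns the MCW type of the last summary line logged for `handle`,
    // or null if the handle has no summary line. The last match wins in
    // case the same handle value appears more than once in the log.
    public static string FindMcwType(IEnumerable<string> lines, string handle)
    {
        string found = null;
        foreach (var line in lines)
        {
            var m = TypeLine.Match(line);
            if (m.Success && m.Groups[1].Value == handle)
                found = m.Groups[2].Value;
        }
        return found;
    }
}
```

Usage would look something like `GrefLookup.FindMcwType(System.IO.File.ReadLines("grefs.txt"), "0x64fd6")`, where the file is the copy pulled from the device. The handle string must be written exactly as it appears in the log.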
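So at that point I knew it had something to do with the IEGL10 interface. I went back, tried removing it from the loop, and that did it! Now the GREF count never exceeds 600.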
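If the EGL call cannot simply be removed, one option is to create the IEGL10 wrapper once and reuse it. The stack trace above shows each `JavaCast<IEGL10>()` call constructing a new `IEGL10Invoker` and taking a new global reference, so hoisting the cast out of the per-frame path should avoid that growth. This is a minimal sketch and not the fix used here (which was deleting the line): `EglAccessor` and `GetCurrentDisplay` are hypothetical names, the code assumes a single rendering thread, and it only builds inside a Xamarin.Android project.

```csharp
using Android.Runtime;                  // JavaCast<T>() extension method
using Javax.Microedition.Khronos.Egl;   // EGLContext, IEGL10, EGLDisplay

// Hypothetical helper: casts EGLContext.EGL to IEGL10 once and reuses the
// wrapper, so a per-frame call does not create a new IEGL10Invoker each time.
public sealed class EglAccessor
{
    private IEGL10 _egl;   // created on first use, then reused for every frame

    public EGLDisplay GetCurrentDisplay()
    {
        // Assumption: only one thread calls this, so no locking is done here.
        if (_egl == null)
            _egl = EGLContext.EGL.JavaCast<IEGL10>();
        return _egl.EglGetCurrentDisplay();
    }
}
```

Whether this keeps the count flat in a given app is worth confirming by watching `grefc` in the gref log after the change, since the returned EGLDisplay is itself a wrapped Java object.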
Quick steps:
1. enable gref logging
2. run app
3. check logs for the memory addresses that are not being collected properly (where your app gets stuck in a GC loop, you'll likely see a ton of repeated lines)
4. search for those memory addresses
5. check for the object **type** that is in the memory handle assignment stack trace
6. go back to your long running problematic loops and see if you can find a matching method being called or an object instantiation in rapid succession and not being disposed of
7. either `Dispose` your looped objects manually or I also read to try and avoid `Java.Lang.Object` inheritance if it's in a long running loop.
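Steps 3 to 5 can be approximated in bulk by counting how often each wrapper type shows up in the log. The sketch below is an illustration only: `GrefTypeTally` and `CountByMcwType` are invented names, and it assumes creation records carry an `MCW type:` entry as in the excerpt earlier in this answer. A type with a count in the thousands is a likely candidate for the object being created in a tight loop.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Xamarin): tallies wrapper types in a gref log.
public static class GrefTypeTally
{
    // Captures the managed wrapper type from a record summary line.
    private static readonly Regex McwType = new Regex(@"MCW type: `([^`]+)`");

    // Returns (type, count) pairs, most frequently logged wrapper type first.
    public static List<KeyValuePair<string, int>> CountByMcwType(IEnumerable<string> lines)
    {
        var counts = new Dictionary<string, int>();
        foreach (var line in lines)
        {
            var m = McwType.Match(line);
            if (!m.Success) continue;          // not a summary line
            string type = m.Groups[1].Value;
            int n;
            counts.TryGetValue(type, out n);   // n stays 0 for a new type
            counts[type] = n + 1;
        }
        return counts.OrderByDescending(kv => kv.Value).ToList();
    }
}
```

The result can be printed or inspected in a debugger; the count reflects how many records were logged per type, not how many references are still alive.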
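You might not be as lucky as I was, since I could simply comment out one line of code. You might have to figure out a way to dispose your looped objects manually, or do something else to let the app know those objects are safe to GC, but the memory addresses should give you a clue as to which object is causing the problem.
Sorry this isn't the best answer, but I'm not that familiar with how the GC works. If anyone can provide a better explanation I'd love to hear more details, but this is how I solved it! Hope it helps.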