Wednesday 26 October 2011

GC, finalisers

So I was doing some memory profiling the other day (using NetBeans' excellent profiler - boy, I could've used this 10 years ago) to try to track down some resource leaks, and I noticed that xuggle was really exercising the system heavily.

So it seems I might look at moving to jjmpeg in my client's application fairly soon. There are some other reasons as well: not being able to run in a 64-bit JVM on Microsoft Windows is starting to become a problem, and the bundled ffmpeg is a bit out of date.

Since I haven't implemented memory handling completely in jjmpeg I went looking for how to do it 'properly'. I was just going to use finalisers, but then I came across this article on Java finalisers, which said that probably wasn't a good idea.

I was going to have a short look this morning but suddenly it was 4 hours later, and although I had something which works I'm not sure yet that I like it. It seems the cleanest way to implement the article's suggestion of using weak references while still mixing the auto-generated and hand-crafted code I want, so I will probably end up running with it. The public API didn't need to change.

Previously, the binding worked with an object class hierarchy something like this:
AVNative [
  ByteBuffer p (points to allocated/mapped native memory)
]
+- AVFormatContextAbstract [
     Generated field accessors and native methods
     Most methods are object methods
   ]
   +- AVFormatContext [
        Public factory methods/constructors
        Hand-coded specific methods
        Hand-coded helper native methods
        Hand-coded finalise/dispose methods
      ]

The new structure:
WeakReference<AVObject>
+- AVNative [
     ByteBuffer p pointing to native memory
     internal dispose() method
     weak reference queue/cleanup as from article above
     Referent of the weak reference is the AVObject
   ]
   +- AVFormatContextNativeAbstract [
        Generated field accessors and native methods
        All methods and field accessors are static
      ]
      +- AVFormatContextNative [
           Hand-coded helper native methods
           Implements native resource dispose
         ]

Together with
AVObject [
  AVNative n (the pointer to the native wrapper object)
  public dispose method
]
+- AVFormatContextAbstract [
     Generated public access methods which use AVFormatContextNative(Abstract) methods.
   ]
   +- AVFormatContext [
        Public factory methods/constructors
        Hand-coded specific methods
      ]
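
As a rough sketch of how the two halves above could fit together (this is not the actual jjmpeg code): AVNative and AVObject follow the outline, while DemoContext, DemoContextNative, the `live` set, the `cleanup()` method and the release counter are names made up for illustration. A direct ByteBuffer and a counter stand in for real native memory and a real native free call, and running cleanup from the factory method is just one assumed place to drain the queue.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.nio.ByteBuffer;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class DisposeDemo {
    public static void main(String[] args) {
        DemoContext a = DemoContext.create();
        a.dispose();
        a.dispose(); // second call is a no-op
        System.out.println("released: " + DemoContextNative.released.get()); // prints "released: 1"
    }
}

// Native-side half: weakly refers to the public object that owns it.
abstract class AVNative extends WeakReference<AVObject> {
    // The collector posts an AVNative here once its AVObject is unreachable.
    static final ReferenceQueue<AVObject> queue = new ReferenceQueue<>();
    // Strong references keep each AVNative alive until it has been disposed.
    static final Set<AVNative> live = ConcurrentHashMap.newKeySet();

    final ByteBuffer p; // stand-in for the mapped native memory
    private boolean disposed;

    AVNative(AVObject owner, ByteBuffer p) {
        super(owner, queue);
        this.p = p;
        live.add(this);
    }

    // Subclasses free their native resource here; called at most once.
    abstract void release();

    // Safe to call repeatedly, from client code or from the queue.
    final synchronized void dispose() {
        if (disposed) return;
        disposed = true;
        live.remove(this);
        clear(); // a cleared reference is never enqueued later
        release();
    }

    // Dispose every wrapper whose owner has been collected; returns how many were polled.
    static int cleanup() {
        int n = 0;
        Reference<? extends AVObject> r;
        while ((r = queue.poll()) != null) {
            ((AVNative) r).dispose();
            n++;
        }
        return n;
    }
}

// Public half: holds the native wrapper and exposes dispose().
abstract class AVObject {
    AVNative n;
    public void dispose() { n.dispose(); }
}

// Hypothetical leaf classes standing in for AVFormatContextNative / AVFormatContext.
final class DemoContextNative extends AVNative {
    static final AtomicInteger released = new AtomicInteger();
    DemoContextNative(AVObject owner) { super(owner, ByteBuffer.allocateDirect(64)); }
    @Override void release() { released.incrementAndGet(); } // real code would call into C here
}

final class DemoContext extends AVObject {
    private DemoContext() { n = new DemoContextNative(this); }
    static DemoContext create() {
        AVNative.cleanup(); // assumed policy: drain the queue whenever something new is allocated
        return new DemoContext();
    }
}
```

The `live` set matters: without a strong reference to each AVNative, the wrapper could be collected together with its owner and never reach the queue.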

So yeah - a bit more complicated, and it requires 2 objects for each instance (and often 3 including the C side instance it's wrapping), as well as the overhead of the WeakReference instance data and the list entry for tracking the references. The extra layer of indirection also adds another method invocation/stack frame to every method call.

On the other hand, it lets the client code use dispose() when it wants to, and if it forgets then dispose will automatically be called eventually. It also makes it obvious in the code where dispose needs to sit.
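
One way the "eventually" part can happen is a background thread that blocks on the reference queue. The following is only a sketch under that assumption, not necessarily what the article or jjmpeg does: ReaperDemo, Handle and startReaper are hypothetical names, and `enqueue()` is called by hand to stand in for the collector so the example does not depend on GC timing.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class ReaperDemo {
    static final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Strong references so handles outlive their owners until they are reaped.
    static final Set<Handle> live = ConcurrentHashMap.newKeySet();

    // Hypothetical tracked reference carrying the action that frees the resource.
    static final class Handle extends WeakReference<Object> {
        final Runnable free;
        Handle(Object owner, Runnable free) {
            super(owner, queue);
            this.free = free;
            live.add(this);
        }
    }

    // Starts a daemon thread that blocks on the queue and frees whatever arrives.
    static Thread startReaper() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Reference<?> r = queue.remove(); // blocks until a reference is enqueued
                    Handle h = (Handle) r;
                    if (live.remove(h)) h.free.run(); // free each handle at most once
                }
            } catch (InterruptedException e) {
                // interrupted: stop reaping
            }
        }, "reaper");
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Returns true if the reaper freed the handle within five seconds.
    static boolean demo() {
        startReaper();
        CountDownLatch freed = new CountDownLatch(1);
        Object owner = new Object();
        Handle h = new Handle(owner, freed::countDown);
        h.enqueue(); // stand-in for the collector finding owner unreachable
        try {
            return freed.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In real use nothing calls `enqueue()` by hand, so how soon the free action runs depends entirely on when the collector gets around to noticing the owner is gone.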

As usual it's a question of trade-offs. If the article is correct then presumably these trade-offs are worth it.

The whole point of using jjmpeg here is to avoid numerous allocations every frame anyway: I can allocate working and output buffers once and just use them directly. So the actual number of objects is quite small and they aren't created very often, and I suspect that either mechanism would work about as well as the other.
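
The allocate-once idea looks roughly like this. It is a minimal sketch only: `decodeInto` is a hypothetical stand-in for a native decode call, and the frame contents are fake.

```java
import java.nio.ByteBuffer;

class FrameLoopDemo {
    // Hypothetical stand-in for a native call that fills `out` with one frame.
    static void decodeInto(ByteBuffer out, int frameNo) {
        out.clear();
        while (out.hasRemaining()) out.put((byte) frameNo);
        out.flip(); // make the written bytes readable
    }

    // Processes `frames` frames through a single buffer; returns the sum of all bytes read.
    static long run(int frames, int frameBytes) {
        ByteBuffer out = ByteBuffer.allocateDirect(frameBytes); // allocated once, outside the loop
        long sum = 0;
        for (int i = 0; i < frames; i++) {
            decodeInto(out, i); // reuses the same buffer every frame
            while (out.hasRemaining()) sum += out.get() & 0xff;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(3, 4)); // prints 12
    }
}
```

With this shape the only long-lived wrapper is the buffer itself, so the per-frame path creates no objects that need disposing.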

Well this distraction has blown my morning away; I'd better leave it for now so I can clock up some work hours after lunch.

Update: I figured I'd gone too far down this route to do anything other than keep it. I've checked this in now, along with a bunch of other stuff described on the project page.

Update 2: Oracle keeps breaking links, but I've updated the pointer. I'm looking at this again (September 2012) because of some issues in jjmpeg.

2 comments:

mbien said...

The problem with this is that you are correlating the heap size + GC algorithm with native resources. It is not guaranteed that a GC will ever free an unused object. All current GCs are lazy, for example.

What makes it even worse is that a rather small object can represent a rather large native resource. In jocl, for example, the native resources don't have to be on the same physical device. That's why I didn't implement automatic cleanup, since it would only claim safety. (In the worst case it would only delay running out of native resources.)

NotZed said...

Yeah, I know there are some problems with it. I just don't want to end up with a refcounted API and completely explicit management.

I've done some testing and it seems to work OK enough for me even if I leave it to clean up automatically.

But I'm sure you know: one lives and learns ... so I could be completely wrong.