Offscreen rendering is slow with the DirectX version

While browsing the Java3D sources, I noticed the IMPRESSIVE (nightmarish) copyDataToSurfacexxx() function in D3dUtil.cpp, which is used to copy from a Direct3D surface to the offscreen buffer.

I suspect that this function is the bottleneck of offscreen rendering with Java3D on DirectX.
The OGL implementation uses readPixels() and is therefore probably faster, since the copy is handled by the OpenGL implementation itself.

Why not reimplement it using DirectX's own hardware-accelerated copy functions?

I found something interesting here: http://www.geocities.com/foetsch/d3d8screenshot/d3d8screenshot.htm
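
To make the idea concrete, here is a minimal sketch of the CPU side of such a readback, under stated assumptions. Once a surface has been locked (in Direct3D, LockRect fills a D3DLOCKED_RECT with a row pitch and a pointer to the pixel data), the image can be copied one row at a time with memcpy instead of one pixel at a time. The LockedSurface struct and copyRowsPacked() below are hypothetical names of mine, not Java3D or Direct3D code, and the sketch assumes a 32-bit pixel format with no format conversion. I have not verified how much of copyDataToSurfacexxx() is conversion work that a plain row copy could not replace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a locked Direct3D surface (compare D3DLOCKED_RECT:
// a row pitch in bytes plus a pointer to the first row).
struct LockedSurface {
    const std::uint8_t* bits;  // first byte of row 0
    std::size_t pitch;         // bytes between the starts of consecutive rows
};

// Copy a width x height image from a pitched surface into a tightly packed
// buffer, using one memcpy per row rather than one store per pixel.
// Assumption: 4 bytes per pixel and identical source/destination formats.
std::vector<std::uint8_t> copyRowsPacked(const LockedSurface& src,
                                         std::size_t width,
                                         std::size_t height) {
    const std::size_t rowBytes = width * 4;
    assert(src.pitch >= rowBytes);  // the pitch may include padding, never less
    std::vector<std::uint8_t> dst(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y) {
        std::memcpy(dst.data() + y * rowBytes,
                    src.bits + y * src.pitch,
                    rowBytes);
    }
    return dst;
}
```

The row padding implied by the pitch is skipped, so the result is a contiguous width * height * 4 byte image. Whether this would actually beat the current code depends on the surface format and on where the surface lives, which would need measuring.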

Mik