Some more iPhone benchmarks and tests with everybody’s favorite difficult-to-pronounce 3D rendering framework!
Thumb mode and floating point: Just don’t
Ok, first, I was an idiot before and somehow was still compiling in thumb mode. I know not to do that!
Somehow the setting was removed, but I guess the default is ON even with it missing or something. After adding it back by hand, the math-heavy md2 anim rendering test went from 31 fps to 60+ fps.
I think the mistake came from editing the build configuration by clicking on the project settings instead of the “build target” settings, so I’m going to check both from now on to be safe.
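A minimal sketch of a compile-time guard against this kind of slip, assuming a GCC or Clang toolchain, both of which predefine `__thumb__` when generating Thumb code for ARM. The names `BUILD_IS_THUMB` and `builtInThumbMode` are made up for illustration and are not part of Irrlicht or Xcode.

```cpp
// GCC and Clang predefine __thumb__ when compiling ARM code in Thumb mode.
// BUILD_IS_THUMB is a hypothetical helper macro, not a toolchain name.
#if defined(__thumb__)
#  define BUILD_IS_THUMB 1
#else
#  define BUILD_IS_THUMB 0
#endif

// True when this translation unit was compiled as Thumb code.
constexpr bool builtInThumbMode() { return BUILD_IS_THUMB != 0; }

// Uncomment in a math-heavy source file to make a Thumb build fail loudly
// instead of quietly running slow:
// static_assert(!builtInThumbMode(), "Thumb is on; check project AND target settings");
```

Putting the `static_assert` in one floating-point-heavy file should catch the case where the project-level and target-level settings disagree, since it checks what the compiler actually did rather than what the settings pane shows.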
Irrlicht and Quake 3 style .bsp demo
I was expecting Irrlicht to do some kind of wicked optimizations when rendering a .bsp map. Nope, it does nothing special. You CAN load it into an octree, but if OCTTREE_USE_HARDWARE is defined it renders the whole mesh anyway. (?!!?) If that flag isn’t defined, it does render fewer primitives, but without using VBOs, so it’s way slower anyway.
This forum thread may be of interest about that.
Collision detection would be a better candidate for a software octree but after playing with createOctTreeTriangleSelector with different settings I never saw even one fps of difference. Strange, I would have expected it to noticeably help.
(this section was updated Dec 2nd after Tonic pointed out that you can use VBOs with octrees…)
Adding iPhone controls
As you can see in the screenshot above I’ve added a basic wolf3d control scheme to get around.
I use my own engine to handle the GUI overlay/handling (it draws after Irrlicht is done) and just have it controlling Irrlicht’s FPSControlComponent.
I use camera->setTarget() to look up and down, and I fake UP/DOWN/LEFT/RIGHT key movement by sending pDevice->postEventFromUser() messages. This works ok but bites because the movement is all-or-nothing key presses: it’s not proportional and not full 360-degree. So I’ll need to modify FPSControlComponent or just do my own movement stuff if I want to do it right.
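For reference, here is a rough sketch of what proportional, 360-degree movement could look like if done by hand instead of through faked key events. Everything in it is hypothetical (`Vec2`, `Vec3`, `moveFromStick`); it assumes a left-handed, Y-up coordinate system where yaw 0 faces +Z, and it deliberately makes no Irrlicht calls.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical types for illustration; these are not Irrlicht classes.
struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// stick:      on-screen joystick offset, each axis in [-1, 1]
//             (x = strafe right, y = move forward).
// yawRadians: camera heading around the Y axis, 0 = facing +Z.
// Returns the world-space displacement on the XZ plane for this frame.
Vec3 moveFromStick(Vec2 stick, float yawRadians, float maxSpeed, float dtSeconds)
{
    assert(maxSpeed >= 0.0f && dtSeconds >= 0.0f);

    // Clamp the stick length to 1 so diagonals are not faster than straight lines.
    float len = std::sqrt(stick.x * stick.x + stick.y * stick.y);
    if (len > 1.0f) { stick.x /= len; stick.y /= len; }

    // Assumed convention: forward = (sin yaw, 0, cos yaw), right = (cos yaw, 0, -sin yaw).
    float s = std::sin(yawRadians);
    float c = std::cos(yawRadians);
    float scale = maxSpeed * dtSeconds;

    Vec3 out;
    out.x = (stick.y * s + stick.x * c) * scale;
    out.y = 0.0f;
    out.z = (stick.y * c - stick.x * s) * scale;
    return out;
}
```

Half a stick deflection gives half the speed, and any stick angle maps to a matching world direction, which is exactly what four on/off key events cannot express.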
Does using compressed textures help the FPS on the iPhone?
Short answer: a little.
I added CImageLoaderRTTEX to Irrlicht so it could load my own texture format. It’s sort of a container that houses multiple formats including pvrtc4, pvrtc2, 4444 rgba, 565 rgb, etc. It does extra stuff that normal .pvr’s don’t, like remember the original image size before padding or stretching.
I also modified the b3d and .bsp loader to look for textures with my .rttex file extension first.
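The "look for .rttex first" idea boils down to building a short list of candidate filenames. A minimal sketch, assuming the lookup is a plain extension swap; `withRttexExtension` and `textureCandidates` are hypothetical names, not the actual loader changes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns `path` with its file extension replaced by ".rttex".
// A dot only counts as an extension if it comes after the last path separator.
std::string withRttexExtension(const std::string& path)
{
    assert(!path.empty());
    std::string::size_type slash = path.find_last_of("/\\");
    std::string::size_type dot = path.find_last_of('.');
    bool hasExt = dot != std::string::npos &&
                  (slash == std::string::npos || dot > slash);
    return (hasExt ? path.substr(0, dot) : path) + ".rttex";
}

// Filenames to try, in order: the .rttex variant first, then the name
// the model or map file originally asked for.
std::vector<std::string> textureCandidates(const std::string& path)
{
    std::vector<std::string> out;
    out.push_back(withRttexExtension(path));
    out.push_back(path);
    return out;
}
```

A loader would then walk the list and use the first file that exists, so maps and models that reference .jpg or .tga textures keep working unmodified.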
Some rough numbers with the quake style .bsp map collision disabled:
- Map test with raw 32 bit textures with mipmap chain: 26 fps
- Map test with pvrtc4 format textures with mipmap chain: 28 fps
So yeah, tiny difference.
However, keep in mind there are also other good reasons to use pvrtc:
- Fast loading even when zlib’ed, especially compared to decompressing a .jpg
- Uses a hell of a lot less texture memory
On the down side, the visual artifacts can look pretty bad so you still need some raw formats in your toolbox for specific images like GUI.
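To put a rough number on the memory point, here is some back-of-the-envelope arithmetic. It assumes the size rule from the PVRTC OpenGL ES extension (4 bits per pixel, with each level stored as at least 8x8); the function names are made up for illustration and the results are computed sizes, not measurements from the device.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Bytes for one level of an uncompressed 32-bit RGBA texture.
std::size_t rgba8888Bytes(std::size_t w, std::size_t h)
{
    return w * h * 4;
}

// Bytes for one level of a PVRTC 4bpp texture.
// Assumed rule: dimensions are treated as at least 8x8, at 4 bits per pixel.
std::size_t pvrtc4Bytes(std::size_t w, std::size_t h)
{
    return (std::max<std::size_t>(w, 8) * std::max<std::size_t>(h, 8) * 4 + 7) / 8;
}

// Sums a per-level size function over the full mip chain, down to 1x1.
template <typename F>
std::size_t mipChainBytes(std::size_t w, std::size_t h, F levelBytes)
{
    assert(w > 0 && h > 0);
    std::size_t total = 0;
    for (;;) {
        total += levelBytes(w, h);
        if (w == 1 && h == 1) break;
        w = std::max<std::size_t>(w / 2, 1);
        h = std::max<std::size_t>(h / 2, 1);
    }
    return total;
}
```

For a 512x512 texture the base level alone is 1,048,576 bytes as raw 32-bit versus 131,072 bytes as pvrtc4, an 8x difference, and `mipChainBytes(512, 512, pvrtc4Bytes)` gives the total with mipmaps included.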
Does using mipmaps help speed on the iPhone?
I didn’t notice a difference. Using mipmaps looks better though, although I need to adjust the lod bias a bit so the transitions don’t pop so badly…
Tip: The iPhone requires a full mipmap chain to work, so don’t try to get tricky and only include a few of them.
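A full chain means halving all the way down to 1x1. A small sketch of a sanity check a texture tool could run before writing a file; `fullMipLevelCount` and `isFullMipChain` are hypothetical helper names.

```cpp
#include <cassert>

// Number of levels in a full mip chain: keep halving each dimension
// (never below 1) until the level is 1x1, counting the base level too.
unsigned fullMipLevelCount(unsigned w, unsigned h)
{
    assert(w > 0 && h > 0);
    unsigned levels = 1;
    while (w > 1 || h > 1) {
        if (w > 1) w /= 2;
        if (h > 1) h /= 2;
        ++levels;
    }
    return levels;
}

// True if a texture that ships `providedLevels` levels has the complete chain.
bool isFullMipChain(unsigned w, unsigned h, unsigned providedLevels)
{
    return providedLevels == fullMipLevelCount(w, h);
}
```

A 256x256 texture needs 9 levels (256, 128, 64, 32, 16, 8, 4, 2, 1), so a file that stops at 16x16 with only 5 levels would fail the check.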
For your enjoyment here is what happens when your texture processing utility has a bug in its mipmap generation. (The festive blue, pink, and green colors shouldn’t be there…)
Putting it all together, our own level
I’ve got to say, the king of the 3dsmax exporters for Irrlicht is B3D Pipeline.
I know, you’re thinking “Uh… b3d, that Blitz3d format? Why not use collada or .x or something?” All I can say is this is the exporter that actually worked right for me when trying to get the lightmaps working.
So here is a low poly house in 3dsmax. I’ve applied a light source and got it looking how I want. Then I use max’s render to texture feature to make a single alpha lightmap.
The (old) iPhone has two texture units and this will be the second one, controlling where shadows appear, the same way the .bsp map example works.
After exporting, it pops into the game engine fully lightmapped and ready to go with a single line of code.
After adding a skybox and a simplified collision mesh it still gets nearly 60 fps, not bad. Too bad a game needs more than one house. Hmm, my fov looks a bit extreme.
Even after what you found out about the octtree-use-hardware thing, I still wonder whether rendering only parts of the map would actually be better, and whether the VBOs are any help at all if you’re testing on a 1st gen device…
The hardware is a tile renderer after all, so it should be quite effective at culling away extra overdraw even if you are drawing a bit extra. This is of course assuming the total number of primitives still wouldn’t be that big (10k+ is definitely pushing it, so no wonder it is a bit slow).
But more about the VBOs… apparently, at least on the 1st gen devices (with gles 1.1), the drivers are weird enough that they always just make a copy of the data, so using VBOs probably doesn’t give any speed boost. More info about this in the presentation by Rej from the Assembly09 event:
http://blogs.unity3d.com/2009/08/20/assembly-2009-presentations/
(the second video)
If you have some contradictory findings about VBO performance in some cases, I’d of course be really interested in hearing any results… And of course using them on a 3GS might be a totally different thing regarding performance behavior.