Performance

For a while I’ve been wondering about the performance of the EV3. Since it has a 300 MHz ARM9 processor, I figured performance shouldn’t be an issue.

But after building a simple framework that draws one image using UI_DRAW BITMAP and two sprites that call UI_DRAW PIXEL for each pixel in the sprite, I am already using up 90 of the 100 ms I have for each frame. This basically means that to achieve 10 fps I can’t do more than draw one background and two sprites.

So that is a bit disappointing!

There is a trick: copy the current frame buffer to a temporary buffer and later copy that buffer back to the frame buffer, which is a cheap operation. So I stopped calling UI_DRAW BITMAP for the background every frame. Instead I draw it only once, copy the drawn image to a buffer, and then for each frame copy that buffer back to the frame buffer. That gets me down to 59 ms per frame. So that’s a bit nicer.
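The idea behind the buffer trick can be modeled off the brick. The sketch below is plain Python, not EV3 bytecode, and every name in it (`draw_background`, `saved_background`, `begin_frame`) is made up for illustration. The 178 x 128 size matches the EV3 display; storing one byte per pixel is an assumption made to keep the model simple.

```python
# Illustrative model only: plain Python, not EV3 bytecode.
LCD_WIDTH, LCD_HEIGHT = 178, 128                   # EV3 display size in pixels

frame_buffer = bytearray(LCD_WIDTH * LCD_HEIGHT)   # assumed: 1 byte per pixel


def draw_background(buf):
    """Stand-in for the expensive UI_DRAW BITMAP call: a checkerboard."""
    for y in range(LCD_HEIGHT):
        for x in range(LCD_WIDTH):
            buf[y * LCD_WIDTH + x] = (x // 8 + y // 8) % 2


# Done once, before the game loop: draw the background, then snapshot it.
draw_background(frame_buffer)
saved_background = bytes(frame_buffer)


def begin_frame(buf):
    """Per frame: restore the snapshot with one bulk copy, no redraw."""
    buf[:] = saved_background


# Simulate a frame: a sprite dirties the buffer, the next frame restores it.
frame_buffer[0] = 1 - frame_buffer[0]
begin_frame(frame_buffer)
```

The per-frame work shrinks to a single bulk copy, which is why the background stops costing noticeable frame time.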

If I remove my own sprite drawing function but keep the buffer copying, the frame time drops from 59 ms to 0.3 ms, soooo .. that’s embarrassing. :)

If I keep all my code but remove the call to UI_DRAW PIXEL, the frame time is 44 ms. So the actual drawing is somewhat expensive, but since quite a lot of frame time is used even without drawing, perhaps there is a chance to optimize this.

If I also remove the reading of the pixel in the sprite, I get down to a frame time of 21 ms. These 21 ms are all “my” stuff: my inner loop in the sprite drawing, my algorithm to sort the game objects (bubble sort ftw), my overhead for managing game objects, etc.

And if I remove the call to Sprite_Draw entirely I’m down to 1.5 ms. So the time is spent in that function.

So, to summarize:

Full, old drawing: 90 ms
Drawing with the buffer trick: 59 ms
Drawing with the buffer trick but removing reading pixels from the sprite and writing them to the screen: 22 ms
Drawing with the buffer trick but removing the call to Sprite_Draw: 1.5 ms

So 57.5 ms is spent in this function (20.5 ms if we remove the calls to ARRAY_READ and UI_DRAW PIXEL):
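The subtraction behind those figures can be written out explicitly. This is just arithmetic on the four measurements from the summary, done in Python rather than on the EV3; the variable names and labels are mine.

```python
# Frame times in milliseconds, copied from the summary above.
full_old = 90.0          # background bitmap redrawn every frame
buffer_trick = 59.0      # background restored from a saved buffer
no_read_no_draw = 22.0   # ARRAY READ_CONTENT and UI_DRAW PIXEL removed
no_sprite_draw = 1.5     # Sprite_Draw not called at all

breakdown = {
    "redrawing the background bitmap": full_old - buffer_trick,
    "Sprite_Draw in total": buffer_trick - no_sprite_draw,
    "pixel reads and writes": buffer_trick - no_read_no_draw,
    "rest of Sprite_Draw (loop bookkeeping)": no_read_no_draw - no_sprite_draw,
}

for label, ms in breakdown.items():
    print(f"{label}: {ms:.1f} ms")
```

This gives 31.0 ms for the background redraw, 57.5 ms for Sprite_Draw in total, 37.0 ms for the pixel reads and writes, and 20.5 ms for the rest of the loop.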

subcall Sprite_Draw
{
  IN_16   SpriteDataHandle
  IN_16   ScreenX
  IN_16   ScreenY

  DATA8   SpriteWidth
  DATA8   SpriteHeight
  DATA32  ReadOfs
  DATA16  WriteX
  DATA16  WriteY
  DATA16  MaxX
  DATA16  MaxY
  DATA8   ReadPixel
  DATA8   WritePixel

  // Read sprite dimensions. Keeping READ_CONTENT since this code works.
  ARRAY( READ_CONTENT, -1, SpriteDataHandle, 0, 1, SpriteWidth )
  ARRAY( READ_CONTENT, -1, SpriteDataHandle, 1, 1, SpriteHeight )

  // Setup
  MOVE8_16( SpriteWidth, MaxX )
  ADD16( MaxX, ScreenX, MaxX )

  MOVE8_16( SpriteHeight, MaxY )
  ADD16( MaxY, ScreenY, MaxY )

  // The whole thing
  MOVE32_32( 2, ReadOfs )
  MOVE16_16( ScreenY, WriteY )

Loop_Y:
  MOVE16_16( ScreenX, WriteX )

Loop_X:
  ARRAY( READ_CONTENT, -1, SpriteDataHandle, ReadOfs, 1, ReadPixel )
  JR_EQ8( ReadPixel, 2, DoneDrawing )   // Don't do any drawing at all
  JR_EQ8( ReadPixel, 1, DrawBlack )
  MOVE8_8( BG_COLOR, WritePixel )       // Set WritePixel to White
  JR( Draw )

DrawBlack:
  MOVE8_8( FG_COLOR, WritePixel )       // Set WritePixel to Black

Draw:
  UI_DRAW( PIXEL, WritePixel, WriteX, WriteY )

DoneDrawing:
  ADD32( ReadOfs, 1, ReadOfs )
  ADD16( WriteX, 1, WriteX )
  JR_LT16( WriteX, MaxX, Loop_X )

  ADD16( WriteY, 1, WriteY )
  JR_LT16( WriteY, MaxY, Loop_Y )
}
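To make the data layout and control flow easier to follow, here is a rough model of the same routine in plain Python. It is not EV3 bytecode and will not run on the brick. The dict-based `screen` is a stand-in for the display, and the numeric values given to `BG_COLOR` and `FG_COLOR` are assumptions.

```python
# Illustrative model only: plain Python, not EV3 bytecode.
BG_COLOR, FG_COLOR, TRANSPARENT = 0, 1, 2   # assumed values; 2 = skip pixel


def sprite_draw(sprite, screen, screen_x, screen_y):
    """Mirror of Sprite_Draw: sprite[0] is the width, sprite[1] the height,
    pixel data starts at index 2. `screen` is a dict keyed by (x, y)."""
    width, height = sprite[0], sprite[1]
    read_ofs = 2
    for write_y in range(screen_y, screen_y + height):
        for write_x in range(screen_x, screen_x + width):
            pixel = sprite[read_ofs]
            read_ofs += 1
            if pixel == TRANSPARENT:
                continue                         # DoneDrawing: keep background
            color = FG_COLOR if pixel == 1 else BG_COLOR
            screen[(write_x, write_y)] = color   # UI_DRAW( PIXEL, ... )


# A 3x2 sprite: black, transparent, white / transparent, black, black.
sprite = [3, 2,
          1, 2, 0,
          2, 1, 1]
screen = {}
sprite_draw(sprite, screen, 10, 20)
```

One difference worth noting: the bytecode tests its loop conditions at the bottom, so it always runs the loop body at least once, while this model draws nothing for a zero-sized sprite.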

Reducing compares in the inner loop so it looks like this:

Loop_Y:
  MOVE16_16( ScreenX, WriteX )

Loop_X:
  ARRAY( READ_CONTENT, -1, SpriteDataHandle, ReadOfs, 1, ReadPixel )
  JR_EQ8( ReadPixel, 2, DoneDrawing )   // Don't do any drawing at all
  UI_DRAW( PIXEL, ReadPixel, WriteX, WriteY )

DoneDrawing:
  ADD32( ReadOfs, 1, ReadOfs )
  ADD16( WriteX, 1, WriteX )
  JR_LT16( WriteX, MaxX, Loop_X )

  ADD16( WriteY, 1, WriteY )
  JR_LT16( WriteY, MaxY, Loop_Y )

will reduce the frame time to 52 ms, so a slight improvement, but still not good enough.
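The reduced loop passes the value read from the sprite straight to UI_DRAW PIXEL as the color. That only gives the same picture if the sprite values 1 and 0 happen to equal FG_COLOR and BG_COLOR. The small Python check below (a model, not EV3 bytecode) assumes FG_COLOR is 1 and BG_COLOR is 0; the function names are made up for illustration.

```python
# Illustrative model only: plain Python, not EV3 bytecode.
BG_COLOR, FG_COLOR, TRANSPARENT = 0, 1, 2   # assumed values


def color_with_branches(pixel):
    """Original inner loop: map the sprite value to a color explicitly."""
    if pixel == TRANSPARENT:
        return None                  # nothing is drawn
    return FG_COLOR if pixel == 1 else BG_COLOR


def color_direct(pixel):
    """Reduced inner loop: the sprite value is used as the color itself."""
    if pixel == TRANSPARENT:
        return None
    return pixel


sprite_values = [0, 1, 2]            # the only values the sprite format uses
same = all(color_with_branches(p) == color_direct(p) for p in sprite_values)
print("equivalent:", same)
```

If the color constants had other values, the two loops would diverge and the explicit branches would have to stay.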

So I’ve gotten the frame time down from 90 ms to 52 ms. I’ll accept that performance for now, leave the optimization aside for a while and get on with the actual project.
