Slow with shader

Arohas

The shader runs slowly. I wrote an example demonstrating the problem, based on one of the mojo2 examples. I can't figure out where the error is; please help me optimize the program.

The target is the desktop. I tested on Linux and Windows.

I tried other engines. In AGK2, similar code renders 23000 bubbles at 30 FPS on an i9-9900 on Linux and 14000 on Windows; RayLib in pure C manages more than 60000. I expected similar performance from Cerberus, and I can't figure out what I'm doing wrong.
The code is in the attachment.
 

Attachments

  • cerberus.zip (45.6 KB)
Do you have the same switch to enable or disable the shader in your test codes on the other engines? It would be interesting to see both values as a reference for simple texture rendering.

I had a quick look at your code and I can see that there is a big difference between your shader enabled versus disabled code. With your "disabled" shader you don't change textures at all therefore there might be only one gl draw call per frame. With your shader you are drawing to an image once for each bubble and that means you have one draw call per bubble. This alone could be the reason for a huge drop in fps. If you don't need to have completely independent bubbles (just different in the shader input values) we should look for a way to do it without changing textures. I have to investigate when I have some time.
 
Thanks.
Right now I can't check on my home computer, so I tested on a laptop with an i5-1240P and Iris graphics. AGK with shader: 14000 bubbles at 30 fps; without shader: 17000 at 30 fps. The program and shader code for AGK are slightly different.
 

Attachments

  • BB_AGK.ZIP (2.6 MB)
I see the problem. Your shader code is okay and your Cerberus-X code is okay. You just made one simple mistake: rebinding the shader for every bubble instead of once before the loop.

I also took away one Flush. You can probably optimize this further if you need to, but it's good now on all platforms.

Code:
Strict
Import mojo2
#GLFW_WINDOW_WIDTH=1536
#GLFW_WINDOW_HEIGHT=1024



Global BubbleMax:int = 65536
Global A_Bubbles:TBubble[]
Global L_Bubbles:= New List < TBubble>
Global BubblesCount:int
Global BubbleSprite:int
Global ScrW:int = 1536
Global ScrH:int = 1024

Function Main:Int()
    New MyApp()
    Return 0
End

Function UpdateBubble:Void(Abub:TBubble)
    Local LhalfSize:= int(TBubble.imagesize * Abub.size / 2) + 2
    
    If Abub.pos.x > ScrW - LhalfSize or Abub.pos.x < LhalfSize
        Abub.speed.x = 0 - Abub.speed.x
        Abub.life = Abub.life - 1
    EndIf
    If Abub.pos.y > ScrH - LhalfSize or Abub.pos.y < 1 + LhalfSize
        Abub.speed.y = 0 - Abub.speed.y
        Abub.life = Abub.life - 1
    EndIf
    
    Abub.pos.x = Abub.pos.x + Abub.speed.x
    Abub.pos.y = Abub.pos.y + Abub.speed.y
    
    Abub.iridescent = Abub.iridescent + 0.03
    If Abub.iridescent > 0.9 Then Abub.iridescent = 1 - Abub.iridescent' ???
    
    Abub.animData.x = Abub.animData.x + Abub.animSpeed
    Abub.animData.y = Abub.animData.y + Abub.animSpeed
End

Class MyApp Extends App

    Field BubbleImage:Image
    Field BubbleSprite:Image
    Field myCanvas:Canvas
    Field effect:ShaderEffect
    
    Field level:Float = 0.5
    Field ticks:float
    
    Field frames:int
    Field fps:int
    Field time:int
    Field ShaderEnable:bool

    
    Method OnCreate:Int()
        BubbleImage = Image.Load("bubble.png")
        TBubble.imagesize = BubbleImage.Width()
        
        BubbleSprite = New Image(BubbleImage.Width(), BubbleImage.Height())
        effect = New ShaderEffect()
        effect.InitTexture(BubbleImage)
        myCanvas = New Canvas()
        Return 0
    End
    
    Method OnUpdate:Int()
        Local bub:TBubble
        If MouseDown(MOUSE_LEFT)
            For Local i:= 0 To 9
                bub = New TBubble
                bub.pos.x = MouseX()
                bub.pos.y = MouseY()
                L_Bubbles.AddLast(bub)
            Next
        EndIf
        If KeyHit(KEY_SPACE)
            ShaderEnable = not (ShaderEnable)
        EndIf
        Return 0
    End
    
    Method OnRender:Int()
    Local delta1:int
    Local le:int
    Local Ss:string
        frames += 1
        le = Millisecs() -time
        If le >= 1000
            fps = frames
            frames = 0
            time += le
        EndIf

        myCanvas.Clear(0.2, 0.2, 0.2)
        le = Millisecs()
        ' Render the shader into the sprite once per frame, before the loop
        effect.Render(BubbleSprite)
        For Local Bubble:= EachIn L_Bubbles
            If ShaderEnable
            
                effect.SetLevel(Bubble.iridescent, Bubble.cr, Bubble.cg, Bubble.cb)
                
                myCanvas.DrawImage(BubbleSprite, Bubble.pos.x, Bubble.pos.y,
                    Bubble.animSpeed * Sin(Bubble.animData.x),
                    Bubble.size - (Sin(Bubble.animData.x) / 20),
                    Bubble.size - (Cos(Bubble.animData.y) / 20))
                
                ' myCanvas.Flush()
                Ss = "enable."
            Else
                myCanvas.DrawImage(BubbleImage, Bubble.pos.x, Bubble.pos.y,
                    Bubble.animSpeed * Sin(Bubble.animData.x),
                    Bubble.size - (Sin(Bubble.animData.x) / 20),
                    Bubble.size - (Cos(Bubble.animData.y) / 20))
                Ss = "disable."
            EndIf
            UpdateBubble(Bubble)
            If Bubble.life < 1
                L_Bubbles.Remove(Bubble)
            EndIf
        Next
        delta1 = Millisecs() -le
        
        myCanvas.DrawText("Click LMB to generate bubbles, press SPACE to enable/disable shader ", 10, 10)
        myCanvas.DrawText("Bubbles: " + String(L_Bubbles.Count()), 10, 26)
        myCanvas.DrawText("Shader: " + Ss + "    FPS: " + String(fps), 10, 42)
        myCanvas.DrawText("Draw ms: " + String(delta1), 10, 58)
        myCanvas.Flush()
        Return 0
    End
End'MyApp

Class float2d
    Field x:float
    Field y:float
End'float2d

Class TBubble
    Global imagesize:int
    Field pos:float2d
    Field speed:float2d
    Field animData:float2d
    Field animSpeed:float
    Field life:int
    Field iridescent:float
    Field size:float
    Field cr:float
    Field cg:float
    Field cb:float
    
    Method New()
        pos = New float2d
        speed = New float2d
        Local angle:= Rnd(1, 360)
        Local spd:= Rnd(0.5, 4)
        speed.x = Cos(angle) * spd
        speed.y = Sin(angle) * spd

        animData = New float2d
        animData.x = Rnd(-2, 2)
        animData.y = Rnd(-2, 2)
        animSpeed = Rnd(2, 6)
        
        size = Rnd(0.2, 1.0)
        life = int(Rnd(10, 50))
        iridescent = Rnd(0.0, 1.0)

        Select int(Rnd(1, 3.4))
            Case 1
                cr = 0.0
                cg = 0.1
                cb = 0.5
            Case 2
                cr = 0.0
                cg = 0.5
                cb = 0.0
            Case 3
                cr = 0.3
                cg = 0.4
                cb = 0.0
        End
    End
    
End'TBubble

Class BWShader Extends Shader
    Private
    Global _instance:BWShader
    
    Method New()
        Build(LoadString("shader.glsl"))
    End
    
    Method OnInitMaterial:Void( myMaterial:Material )
        myMaterial.SetTexture("ColorTexture", Texture.White())
        myMaterial.SetScalar("offset", 0)
        myMaterial.SetScalar("cr", 0.0)
        myMaterial.SetScalar("cg", 0.0)
        myMaterial.SetScalar("cb", 0.0)
    End
    
    Function Instance:BWShader()
        If Not _instance _instance = New BWShader()
        Return _instance
    End
        
End'BWShader

'========================================================
Class ShaderEffect

    Private
    Global _canvas:Canvas
    Field _material:Material

    Method New()
        ' ensure there is a single instance of the canvas
        If Not _canvas _canvas = New Canvas()
        _material = New Material(BWShader.Instance())
    End
    
    Method SetLevel:Void(level:Float, c1:float, c2:float, c3:float)
        ' set the level of effect to the supplied value
        
        _material.SetScalar("offset", level)
        _material.SetScalar("cr", c1)
        _material.SetScalar("cg", c2)
        _material.SetScalar("cb", c3)
    End
    
    Method InitTexture:Void(source:Image)
        _material.SetTexture("ColorTexture", source.Material.ColorTexture)
    End
        
    Method Render:Void(target:Image)
        _canvas.SetRenderTarget(target)
        _canvas.SetViewport(0, 0, target.Width(), target.Height())
        _canvas.SetProjection2d(0, target.Width(), 0, target.Height())
        _canvas.Clear(0.0, 0.0, 0.0, 0.0)
        _canvas.DrawRect(0, 0, target.Width(), target.Height(), _material)
        _canvas.Flush()
    End
    
End'ShaderEffect
 
As Phil7 said, the buffer is not always needed. You can often render the sprite directly using a shader like this; sorry that I don't have time to provide a complete demo right now. Depending on the scenario, you might want to keep the buffer, but it's good to know there's a choice.

Code:
src = Image.Load("graphics.png",0,0,0)
dst = New Image(src.Width,src.Height,0,0)
fx = New Shaderfx
 
@Wingnut Maybe I don't fully understand how the example is intended to look, but I am pretty sure @Arohas wants a different look for each bubble, which would need an effect.Render() for each bubble. Otherwise the effect.SetLevel() call is not applied.
I did some tests and I think I am one little step closer to a solution: I set one material per bubble that reuses the same shader but sets different uniforms for each bubble. Then I drew the image with DrawRect() using that specific material. For that I had to change the shader code a tiny bit to use the correct coordinates. The code is quite a mess at the moment. I'll post it as soon as I have cleaned it up.
Performance-wise, I get to the same level as the AGK exe you posted (6000): around 8000 bubbles before it drops below 30 fps.

@Arohas The main problem I still see is that I am still drawing once per Bubble which happens because we change Material for each Bubble.
At the moment we don't have a way to add attributes to shaders, AFAIK, so I could abuse the color value from SetColor to get the parameters into the shader. Maybe I can test it later today; it could improve performance a lot and get close to the shader-disabled version. BTW, what fps do you get with the shader-disabled code on Cerberus?
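
For readers following along, the one-material-per-bubble idea described in this post might look roughly like the sketch below. It is not the code from any attachment. `MakeBubbleMaterial` and `DrawBubble` are hypothetical helper names, and the sketch assumes `TBubble` has gained a `material:Material` field. It only reuses calls that already appear in this thread (`New Material(BWShader.Instance())`, `SetTexture`, `SetScalar`, `DrawRect` with a material), and it is meant to be dropped into the same source file as the program posted earlier rather than compiled on its own. The shader would also need to take its texture coordinates from the drawn rectangle, which is the small shader change mentioned above.

```monkey
' Sketch only, not the attached code. Assumes TBubble has an extra field:
'     Field material:Material
' MakeBubbleMaterial and DrawBubble are hypothetical helper names.

' Create one material per bubble; every material shares the same shader.
Function MakeBubbleMaterial:Material(source:Image)
    Local mat:= New Material(BWShader.Instance())
    mat.SetTexture("ColorTexture", source.Material.ColorTexture)
    Return mat
End

' Update this bubble's uniforms, then draw a rect that uses its material.
Function DrawBubble:Void(canvas:Canvas, bub:TBubble, source:Image)
    bub.material.SetScalar("offset", bub.iridescent)
    bub.material.SetScalar("cr", bub.cr)
    bub.material.SetScalar("cg", bub.cg)
    bub.material.SetScalar("cb", bub.cb)

    canvas.PushMatrix()
    canvas.TranslateRotateScale(bub.pos.x, bub.pos.y,
        bub.animSpeed * Sin(bub.animData.x),
        bub.size - (Sin(bub.animData.x) / 20),
        bub.size - (Cos(bub.animData.y) / 20))
    ' Draw around the origin so the bubble stays centered on its position.
    canvas.DrawRect(-source.Width() / 2, -source.Height() / 2,
        source.Width(), source.Height(), bub.material)
    canvas.PopMatrix()
End
```

A bubble would get its material once at creation, for example `bub.material = MakeBubbleMaterial(BubbleImage)`, and `DrawBubble(myCanvas, Bubble, BubbleImage)` would replace the shader branch in the render loop. Because the material changes for every bubble, this still costs one draw per bubble.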
 
Look at this code; you will see an extreme slowdown. Cerberus needs a way to pass parameters into shaders, but right now it demands a full update for each object, which is extremely slow. This is not how it should be.

I get 60 fps with 3000 bubbles on all platforms without a shader.
If you use the shader globally it does not change the fps at all, but as soon as you update the parameters per bubble it slows down by a factor of 200.
 

Attachments

  • cerberus.zip (152.3 KB)
Ok, this one is what I was talking about before. Using one Material per Bubble and updating Uniforms once per Bubble on each frame. On my Windows 10 machine it is about 4 times slower than the shader-disabled version, but to me it looks much better than the original one.
There is still a quirk: the bubbles are not exactly in the right place, and I am not sure it behaves properly memory-wise. I just tried to find a simple way to show the result without doing any unnecessary, costly things in between.

The hack with using the color values is not implemented here!
 

Attachments

  • cerberus_phil1.zip (45.9 KB)
Phil7, thanks, your code is much faster. What is surprising, however, is that rendering on Cerberus is slower than on AGK. It seems to me that this is a problem with mojo2; maybe I will gather my courage and do a test with pure OpenGL on Cerberus, but later.
Here are the promised measurements. I used your CX code, because mine is terribly slow:
CX with shader: 5250
CX no shader: 20845
AGK with shader: 21558
AGK no shader: 27130
System: Linux Mint, i9-9900, 64 GB RAM, GTX 1660 Ti with the open-source "nouveau" driver.
 

Attachments

  • Cx_w shader.png (205.3 KB)
  • CX_no shader.png (320.9 KB)
  • AGK w shader.png (129.9 KB)
  • AGK no shader.png (157.2 KB)
If you have only a few palettes, you could batch the bubbles into groups, and you will get around 20k on Cerberus-X. Of course, if you need more than 50-100 palettes, it will not work as well. Moreover, I got a similar slowdown (about 4x) when I compared CX to LibGDX before, so I think there might be something that needs to be fixed.
 
I used to write purely computational benchmarks and programs, and I've always been completely satisfied with the performance of Cerberus for Windows and Android. However, I've never dealt with shaders before. It was just an experiment, the main idea was to write one shader for everything, which, depending on the parameters of the object passed on the stage, draws it in a certain way. (Sorry, my English is Google Translate, I hope you understand what I'm talking about.) In principle, the solution proposed by Phil7 is clear to me and suits me. However, I didn't understand what you mean by palettes and palette groups.
 
My experience is that Cerberus is certainly one of the better modern languages when you need good performance. Shaders and Android have given me slight problems at times, but most of the time it seems to give you more power than you need.

Let me explain the grouping idea I had. Let's say that you have a few different colors that you want to assign your bubbles.
You set one color for the shader (this being the only slow part) and draw all bubbles of that color in one loop. Then you set the shader to the next color and draw all bubbles of that color in another loop, and so on.

This makes it essentially as fast as the non-shader as long as you don't have too many different groups.

Cerberus-X should be able to do this without this trick, but sadly I don't have insight into its inner workings. I agree that Phil7's solution is amazing; I need to look into that one.
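
A rough sketch of this grouping idea, built on the `ShaderEffect` class from the code posted earlier in the thread, could look like the following. `DrawGrouped` is a hypothetical helper name, and `groups` is assumed to be an array with one non-null list of bubbles per palette, filled elsewhere. The sketch also assumes that all bubbles in a group share the same shader parameters, so it does not cover per-bubble values, and it is meant to live in the same source file as the posted program. Flushing the canvas after each group is an assumption as well: it is there so that the bubbles already queued are drawn before the shared sprite is re-rendered for the next group.

```monkey
' Sketch only. DrawGrouped is a hypothetical helper; "groups" is assumed to
' hold one non-null list per palette, and all bubbles in a list are assumed
' to share the same shader parameters.
Function DrawGrouped:Void(canvas:Canvas, effect:ShaderEffect, sprite:Image, groups:List<TBubble>[])
    For Local group:= EachIn groups
        If group.IsEmpty() Then Continue

        ' The slow part, done once per group: set the parameters and
        ' re-render the shared sprite through the shader.
        Local first:= group.First()
        effect.SetLevel(first.iridescent, first.cr, first.cg, first.cb)
        effect.Render(sprite)

        ' The fast part: every bubble in the group reuses the same sprite.
        For Local bub:= EachIn group
            canvas.DrawImage(sprite, bub.pos.x, bub.pos.y,
                bub.animSpeed * Sin(bub.animData.x),
                bub.size - (Sin(bub.animData.x) / 20),
                bub.size - (Cos(bub.animData.y) / 20))
        Next

        ' Assumption: flush here so the queued draws use this group's sprite
        ' before it is re-rendered for the next group.
        canvas.Flush()
    Next
End
```

With this layout the number of costly shader updates per frame equals the number of groups, not the number of bubbles, which is why it only pays off for a small number of palettes.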
 
I understand now, but this only seems to apply in the simplest case. Even this small shader, in addition to color, also draws spots on the surface of the bubble: The algorithm is simple, but there is a lot of math: the color space of the image is converted to HSV, rotated and converted back to RGB. The shift position is individual for each bubble. Although the code is not yet finalized, and I haven't even gotten to dynamic lighting and glare yet! Shaders are definitely not mojo2's forte.
 
@Phil7 Isn't Mojo2 compiling all shaders at each frame?

@Arohas: Regarding AGK being faster, are you using Studio with Vulkan?
 
I agree it's kind of cheating, and what I suggested really should not be needed. The shader works surprisingly well, so I don't think the shader itself slows things down much. I have to confess I don't know how it works in the background; maybe the shader is somehow recompiled with the parameters every time, so that complex GLSL would be slow, but I doubt that's the case.

Everything is speedy when I measure the Cerberus shader, except that you have to run the "Render" part of the shader class and then flush it. Also, if you do several flushes, it seems that only the last one takes effect and all the others get the same parameters.

From my perspective there are only two options. Either the shader class needs to be changed, because I don't think many people have actually used it, so it needs some love. Or Mojo2 has a bug, or rather lacks a much-needed optimization, which I guess is a better term.
 
@Arohas I saw in one of your pictures that you built in debug mode. Release mode should be substantially faster.
@MikeHart Regarding shader compiling: if you only switch the material and the shader stays the same, it skips binding the GL program, AFAIK. Shader compiling does not seem to happen if the shaders are not modified.
 
@Arohas Here is the hacky version with using the color attribute to set the shader parameters. On my machine I get about 50000 at 30 fps compared to 8000 with the agk exe.
We might think about changing some things in mojo2 to support something like this in addition to the color values.
 

Attachments

  • cerberus_phil2.zip (46 KB)
That works great; can you say more about this "trick"? I see that the bubbles have stopped shimmering now, though. Can't you use the old shader with this in Cerberus-X? How is the shader special?

I can help further by optimizing the GLSL. I don't actually see much improvement, but in theory it is good to read from the texture as few times as possible, so I propose this:

Code:
uniform sampler2D ColorTexture;

void shader(){

    float offset = b3d_Color.a;
    float cr = b3d_Color.r;
    float cg = b3d_Color.g;
    float cb = b3d_Color.b;
   
    vec2 texcoords=b3d_Texcoord0;
     //convert clip position to valid tex coords
    //vec2 texcoords=(b3d_ClipPosition.st/b3d_ClipPosition.w)*0.5+0.5;

    //read source color
    //vec3 color1=texture2D( ColorTexture,texcoords ).rgb;
    //float alfa=texture2D( ColorTexture,texcoords ).a;

    // Read the source color once from the texture, then extract rgb and alpha
    vec4 color = texture2D(ColorTexture,texcoords).rgba;
    vec3 color1 = color.rgb;
    float alfa = color.a;

    //convert to hsv
    vec4 K = vec4(0.0, -1.0/3.0, 2.0/3.0, -1.0);
    vec3 c = color1;
    vec4 p = mix(vec4(c.bg, K.wz), vec4(c.gb, K.xy), step(c.b, c.g));
    vec4 q = mix(vec4(p.xyw, c.r), vec4(c.r, p.yzx), step(p.x, c.r));
    float d = q.x - min(q.w, q.y);
    float e = 1.0e-10;
    vec3 color2 = vec3(abs(q.z + (q.w - q.y) / (6.0 * d + e)), d / (q.x + e), q.x);

    //offset color
    color2.g = color2.g + 0.2 + offset/50.0;
    color2.r = color2.r + offset;

    //convert to RGB
    c = color2;
    K = vec4(1.0, 2.0 / 3.0, 1.0 / 3.0, 3.0);
    float r = abs(fract(c.r+ 1.0)*6.0 - 3.0);
    float g = abs(fract(c.r+ 2.0 / 3.0)*6.0 - 3.0);
    float b = abs(fract(c.r+ 1.0 / 3.0)*6.0 - 3.0);

    r = clamp(r,0.01,0.99);
    g = clamp(g,0.01,0.99);
    b = clamp(b,0.01,0.99);

    float r2 = c.b * mix(1.0, r, c.y) ;
    float g2 = c.b * mix(1.0, g, c.y) ;
    float b2 = c.b * mix(1.0, b, c.y) ;
   
    vec3 color4 = vec3( (r2+cr)*alfa, (g2+cg)*alfa, (b2+cb)*alfa);

    //write output
    b3d_FragColor= vec4(color4.rgb,alfa);
    //b3d_FragColor = b3d_Color;
}

I also saw that the bubbles ended up in different positions when drawn using the matrix versus without it, so I fixed that. It is a bit clumsy, but it's a start.

Code:
    Method OnRender:Int()
    Local delta1:int
    Local le:int
    Local Ss:string
        frames += 1
        le = Millisecs() -time
        If le >= 1000
            fps = frames
            frames = 0
            time += le
        EndIf

        myCanvas.Clear(0.2, 0.2, 0.2)
        le = Millisecs()
        For Local Bubble:= EachIn L_Bubbles
            If ShaderEnable
'                Bubble.material.SetScalar("offset", Bubble.iridescent)
'                Bubble.material.SetScalar("cr", Bubble.cr)
'                Bubble.material.SetScalar("cg", Bubble.cg)
'                Bubble.material.SetScalar("cb", Bubble.cb)
                Local curColor:Float[4]
                myCanvas.GetColor(curColor)
                'Print "iri: " + Bubble.iridescent
                myCanvas.SetColor(Bubble.cr, Bubble.cg, Bubble.cb,Bubble.iridescent)
                
            '    myCanvas.PushMatrix()
            '    myCanvas.TranslateRotateScale(Bubble.pos.x, Bubble.pos.y,
            '        Bubble.animSpeed * Sin(Bubble.animData.x),
            '        (Bubble.size - (Sin(Bubble.animData.x) / 20)),
            '        (Bubble.size - (Cos(Bubble.animData.y) / 20)))               
            '    myCanvas.DrawRect(0,0, BubbleImage.Width, BubbleImage.Height, Bubble.material)
            '    myCanvas.PopMatrix()
        
        
                myCanvas.PushMatrix()
                ' Move to the bubble position, then rotate and scale around it
                Local translationX:Float = Bubble.pos.x
                Local translationY:Float = Bubble.pos.y

                myCanvas.TranslateRotateScale(translationX, translationY,
                    Bubble.animSpeed * Sin(Bubble.animData.x),
                    (Bubble.size - (Sin(Bubble.animData.x) / 20)),
                    (Bubble.size - (Cos(Bubble.animData.y) / 20)))

                ' Offset the rect by half its size so it is drawn centered
                Local offsetWidth:Float = -BubbleImage.Width / 2
                Local offsetHeight:Float = -BubbleImage.Height / 2

                myCanvas.DrawRect(offsetWidth, offsetHeight, BubbleImage.Width, BubbleImage.Height, Bubble.material)
                myCanvas.PopMatrix()
    
                myCanvas.SetColor(curColor[0], curColor[1], curColor[2], curColor[3])
                Ss = "enable."
            Else
            
        
                ' Plain textured draw without the shader
                myCanvas.DrawImage(BubbleImage, Bubble.pos.x, Bubble.pos.y,
                    Bubble.animSpeed * Sin(Bubble.animData.x),
                    Bubble.size - (Sin(Bubble.animData.x) / 20),
                    Bubble.size - (Cos(Bubble.animData.y) / 20))
                Ss = "disable."
            EndIf
            UpdateBubble(Bubble)
            If Bubble.life < 1
                L_Bubbles.Remove(Bubble)
            EndIf
        Next
        delta1 = Millisecs() -le
'        Print renderCnt
'        renderCnt=0
        
        myCanvas.DrawText("Click LMB to generate bubbles, press SPACE to enable/disable shader ", 10, 10)
        myCanvas.DrawText("Bubbles: " + String(L_Bubbles.Count()), 10, 26)
        myCanvas.DrawText("Shader: " + Ss + "    FPS: " + String(fps), 10, 42)
        myCanvas.DrawText("Draw ms: " + String(delta1), 10, 58)
        myCanvas.Flush()
        Return 0
    End
 
I'll try to get the same shader working so that it shows the shimmering correctly; it shouldn't be gone completely as it is now. I think the reason for the change is that in GLSL there is something to consider when a value reaches 0.0.
 