• wheezy@lemmy.ml
    17 days ago

    I know people joke around about this a lot. But I was blown away by a recent switch I made from using cv2 with Python to using cv2 with C++.

    I had literally hundreds of thousands of images to analyze for a dataset. My python script would have taken 12 hours.

    I ported it to C++ and it literally destroyed it in 20 minutes.

    I’m sure I was doing something that really wasn’t optimized well for Python. I suspect that somewhere in the backend the C++ build was using a completely different library with multithreading optimizations. Or maybe turbojpeg is just garbage in Python. I’m still not sure what the bottleneck was, and I don’t know enough to really explain why.

    But holy shit. I never had that much of a performance difference in such a simple task.

    Was very impressed.
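
    One way to find out where a batch job like this spends its time is to time each stage separately. The following is only a minimal sketch, not the script described above: `read_bytes`, `decode`, and `analyze` are hypothetical stand-ins (the real versions would be file I/O, a cv2 decode call, and the per-image analysis), and the file names are made up.

```python
import time
from collections import defaultdict


def read_bytes(path):
    # Hypothetical stand-in for reading an image file from disk.
    return bytes(range(256)) * 64


def decode(data):
    # Hypothetical stand-in for decoding (e.g. a cv2 decode call).
    return list(data)


def analyze(pixels):
    # Hypothetical stand-in for the per-image analysis: mean pixel value.
    return sum(pixels) / len(pixels)


def timed_pipeline(paths):
    """Run every path through the stages and accumulate seconds per stage."""
    stages = (("read", read_bytes), ("decode", decode), ("analyze", analyze))
    totals = defaultdict(float)
    results = []
    for path in paths:
        value = path
        for name, stage in stages:
            start = time.perf_counter()
            value = stage(value)
            totals[name] += time.perf_counter() - start
        results.append(value)
    return results, dict(totals)


if __name__ == "__main__":
    _, totals = timed_pipeline([f"img_{i}.jpg" for i in range(100)])
    # Print the slowest stage first.
    for name, seconds in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{name:8s} {seconds:.4f}s")
```

    Swapping the real functions into the three stages would show whether reading, decoding, or analysis dominates before deciding what to port.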

    • jsomae@lemmy.ml
      17 days ago

      Exactly what this comic is saying. C++ can handle in 20 minutes what takes python 12 hours, but something gets destroyed.

      • wheezy@lemmy.ml
        17 days ago

        Was a personal project. But absolutely. Half my job is trying to explain why something is taking so long when in reality it’s already done and I just don’t want to do anything for the next few days.

        Managers never really know. And other engineers don’t care. It’s all about balancing expectations.

    • ChickenLadyLovesLife@lemmy.world
      17 days ago

      I ran into a similar situation many years ago, when I was trying to write a software synthesizer using Visual Basic (version 4 at the time). The big problem is that if you’re doing sample-by-sample processing of audio data in a loop (like doing pixel-by-pixel processing of images) and your chosen language’s compiler can’t compile to a native EXE or inline calls, then you end up suffering the performance hit of function calls that have to be made for each sample (or pixel). In many applications you’re not making a lot of function calls and the overall performance hit is negligible, but when you’re doing something where you’re making hundreds of thousands or even millions of calls per second, you’re screwed by the overhead of the function calls themselves - without there being any other sort of inefficiency going on.

      In my case, I eventually offloaded the heavy sample processing to a compiled DLL I wrote in C, and I was able to keep using Visual Basic for what it did really well, which was quickly building a reliable Windows GUI.
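
      The same effect can be reproduced in Python as a rough sketch. This is not the Visual Basic synthesizer described above: the `attenuate` function and the 8-bit sample buffer are assumptions made for illustration, and `bytes.translate` stands in for the compiled DLL, since CPython runs its loop in C.

```python
import time


def attenuate(sample):
    # Called once per sample: halve an 8-bit value.
    return sample // 2


def per_sample(buf):
    # One interpreted function call for every sample in the buffer.
    return bytes(attenuate(s) for s in buf)


# 256-entry lookup table holding the same mapping for every possible byte.
TABLE = bytes(v // 2 for v in range(256))


def bulk(buf):
    # A single call; the per-sample loop runs in compiled code.
    return buf.translate(TABLE)


if __name__ == "__main__":
    buf = bytes(range(256)) * 4000  # roughly one million samples
    for fn in (per_sample, bulk):
        start = time.perf_counter()
        fn(buf)
        print(fn.__name__, f"{time.perf_counter() - start:.4f}s")
```

      Both functions return the same bytes; the only difference is whether the call overhead is paid once per sample or once per buffer.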

    • HiddenLayer555@lemmy.ml
      16 days ago

      Might be latency? Python is probably slower to respond to things like when one job is done and the next job can start, since it needs time to interpret the code and call the relevant C libraries.