That makes complete sense - if you’ve got something ‘needy’, as soon as it’s queuing up, I imagine it snowballs, too…
10-20 times the core count is crazy, but I guess it’s had a lot of development effort into parallelizing it’s execution, which of course goes against what your use case is :)
Theoretically a load average could be as high as it likes, it’s essentially just the length of the task queue, after all.
Processes having to queue to get executed is no problem at all for lots of workloads. If you’re not running anything latency-sensitive, a huge load average isn’t a problem.
Also it’s not really a matter of parallelization. Like I mentioned, ffmpeg impacted other processes even when restricted to running in a single thread.
That’s because most other processes will do work in small chunks that complete within nanoseconds. Send a network request, parse some data, decode an image, poll HID device, etc.
A transcode meanwhile can easily have a CPU running full tilt for well over a second, working on just that one thing. Most processes will show up and go “I need X amount of CPU time” while ffmpeg will show up and go “give me all available CPU time” which is something the scheduler can’t actually quantify.
It’s like if someone showed up at a buffet and asked for all the food that no-one else is going to eat. How do you determine exactly how much that is, and thereby how much it is safe to give this person without giving away food someone else might’ve needed?
You don’t. Without CPU headroom it becomes very difficult for the task scheduler to maintain low system latency. It’ll do a pretty good job, but inevitably some CPU time that should have gone to other stuff, will go the process asking for as much as it can get.
That makes complete sense - if you’ve got something ‘needy’, as soon as it’s queuing up, I imagine it snowballs, too…
10-20 times the core count is crazy, but I guess it’s had a lot of development effort into parallelizing it’s execution, which of course goes against what your use case is :)
Theoretically a load average could be as high as it likes, it’s essentially just the length of the task queue, after all.
Processes having to queue to get executed is no problem at all for lots of workloads. If you’re not running anything latency-sensitive, a huge load average isn’t a problem.
Also it’s not really a matter of parallelization. Like I mentioned, ffmpeg impacted other processes even when restricted to running in a single thread.
That’s because most other processes will do work in small chunks that complete within nanoseconds. Send a network request, parse some data, decode an image, poll HID device, etc.
A transcode meanwhile can easily have a CPU running full tilt for well over a second, working on just that one thing. Most processes will show up and go “I need X amount of CPU time” while ffmpeg will show up and go “give me all available CPU time” which is something the scheduler can’t actually quantify.
It’s like if someone showed up at a buffet and asked for all the food that no-one else is going to eat. How do you determine exactly how much that is, and thereby how much it is safe to give this person without giving away food someone else might’ve needed?
You don’t. Without CPU headroom it becomes very difficult for the task scheduler to maintain low system latency. It’ll do a pretty good job, but inevitably some CPU time that should have gone to other stuff, will go the process asking for as much as it can get.