Dan wrote:

I have written a PHP task/job scheduler... it fires off jobs which are PHP scripts that perform various tasks.

Currently it's working pretty well, but every now and again I find that some of the jobs are "stuck" because another job is still running and holding them up.

I attempted to prevent this by using the following to launch these jobs... I am testing this from the terminal, and within my code I fire it off using PHP's "exec()" command.

bash -c "exec nohup setsid curl -s 'http://localhost/MYJOBHERE_1.PHP' > /dev/null 2>&1 &"

When I have multiple jobs to run I just send the same as above, joining each command to the next with an ampersand (&) so it becomes one long command like so:

bash -c "exec nohup setsid curl -s 'http://localhost/MYJOBHERE_1.PHP' > /dev/null 2>&1 &" & bash -c "exec nohup setsid curl -s 'http://localhost/MYJOBHERE_2.PHP' > /dev/null 2>&1 &" & bash -c "exec nohup setsid curl -s 'http://localhost/MYJOBHERE_3.PHP' > /dev/null 2>&1 &"

Now if I send a bunch of these from the terminal, calling scripts that write to a DB table so I can watch the timestamps... it seems like they all run, but at very different times. Is this the operating system deciding how to prioritize them, or what? Is there any way to make sure they all run at the same time, without other processes slowing them down?

The biggest issue is that some tasks are really of higher priority than others... is there a way for me to assign a priority to those tasks using the method I have?

Thanks for any and all input.

Jeremy Davis wrote:

I've just re-read your post, and I think that I missed your point completely and went on a ramble... So unfortunately, I don't think that my long winded post below really addresses your question at all, but perhaps it still has some value (for someone else, if not you)...

On reflection, it seems that you are essentially trying to troubleshoot your PHP code, via calling it from bash. So whilst this doesn't really answer your question at all, seeing as I've written it, I'll post it anyway...

AFAIK processes all start with the same default priority (known as the "nice" value). You can read this value and even change it, but I'm not sure that's really what you want to be doing...
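FWIW, if you did want to go down that path, the nice/renice commands are the usual tools. Here's a quick untested-in-anger sketch; the curl line in the comment is just your example URL, and the numbers are arbitrary:

```shell
# "nice" with no arguments prints the current niceness; running a
# command via "nice -n 10" makes it 10 steps "nicer" (lower priority).
# Applied to your example it would look something like:
#   nice -n 10 nohup curl -s 'http://localhost/MYJOBHERE_1.PHP' >/dev/null 2>&1 &
default_nice=$(nice)             # usually 0
lowered_nice=$(nice -n 10 nice)  # run "nice" itself at niceness +10
echo "default=$default_nice lowered=$lowered_nice"
# an already-running process can be adjusted with: renice -n 5 -p <PID>
# (note: only root can use negative values to *raise* priority)
```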

IMO the way to go would be for the individual jobs to be handled within your PHP app. That way you can programmatically ensure that the higher priority ones run first, decide whether lower priority ones get abandoned if they haven't finished when the high priority ones need to run again, and so on. Plus you can also have some jobs relying on the successful completion of others.
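To illustrate what I mean in shell terms (the job names and priority numbers here are completely made up, and a real scheduler would launch each job rather than just echoing):

```shell
# sort the job list so lower numbers (higher priority) run first
jobs='2 cleanup
1 billing
3 report'
result=$(printf '%s\n' "$jobs" | sort -n | while read -r prio name; do
  printf 'running %s (priority %s)\n' "$name" "$prio"
done)
echo "$result"
```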

Not sure if any of this is useful to you or not though...

I preface with "Maybe you know more than me" as whilst I'm fairly experienced with bash, I wouldn't for a second suggest that I'm a master of its dark ways! :) Also, I'm a complete PHP newb, so perhaps I'm totally misunderstanding something?!

So perhaps you know exactly what you're doing and I'm just not as experienced with bash as I'd like to think. But on face value, what you're doing there seems totally redundant on multiple levels... Below I'll walk through my understanding of why I say that. Apologies if I'm teaching you to suck eggs. Please feel free to explain why I misunderstand if that's the case... Here's my reading of what you are doing:

bash -c "..."

By default TurnKey's shell is already bash. FWIW, to double check the shell, you can "echo $SHELL". So unless you have changed the default, or are triggering this from some other shell/environment, this is redundant.

exec ...

TBH, I've never explicitly used exec. I get that it's a thing and I've seen it used, but unless you have some really specific requirements for how bash hands off to your command (which doesn't seem to be the case, as you're sending everything to /dev/null), I'm not sure why you'd be using it here. FWIW, here's what the man page says on the matter (it's a bash built-in, so it's buried in bash's man page):

exec [-cl] [-a name] [command [arguments]]
If command is specified, it replaces the shell. No new process is created. The arguments
become the arguments to command. If the -l option is supplied, the shell places a dash at
the beginning of the zeroth argument passed to command. This is what login(1) does. The
-c option causes command to be executed with an empty environment. If -a is supplied,
the shell passes name as the zeroth argument to the executed command. If command
cannot be executed for some reason, a non-interactive shell exits, unless the execfail
shell option is enabled. In that case, it returns failure. An interactive shell returns failure if
the file cannot be executed. If command is not specified, any redirections take effect in the
current shell, and the return status is 0. If there is a redirection error, the return status is 1.

So unless I'm missing something, in the example you give, using exec is again redundant.
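You can actually see what exec does with a quick test: the second echo below never runs, because by then exec has already replaced the shell with the first command.

```shell
# exec replaces the shell process with the command, so anything
# after it in the same shell is never reached
out=$(bash -c 'exec echo replaced; echo never-printed')
echo "$out"
```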

nohup ...

nohup is a pretty common way of launching a command so that it ignores hangups (i.e. it keeps running after the terminal closes). So given your explanation of what you are trying to do, its use here seems completely reasonable. However...:

setsid ...

setsid runs a process in a new session?! So it essentially does the same job as nohup, albeit slightly differently (a new session detaches the process from the controlling terminal entirely), so AFAIK it's redundant when used with nohup.
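If you want to see the difference for yourself, here's a Linux-specific check reading the session ID out of /proc (field 6 of /proc/PID/stat):

```shell
# a plain child process inherits our session ID, whereas setsid gives
# the child a brand-new session of its own (the "detached" effect)
parent_sid=$(cut -d' ' -f6 "/proc/$$/stat")
child_sid=$(setsid sh -c 'cut -d" " -f6 "/proc/$$/stat"')
echo "parent=$parent_sid child=$child_sid"
```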


curl -s 'http://localhost/MYJOBHERE_1.PHP' > /dev/null 2>&1

Obviously that's the command you want to run (with no output). Not really much to add there explicitly, but it's possibly worth noting that, assuming the file exists on the local server (as per the URL), rather than triggering it via curl it would be more efficient to run it directly via PHP from the command line. E.g. something like this:

php -q /var/www/MYJOBHERE_1.PHP >/dev/null 2>&1

Although note that that will run as root, which may be a problem, depending on what it's doing... If you want to run it as www-data, you can leverage su like this (www-data user doesn't have a shell, so you need to set one):

su - www-data -s /bin/bash -c "php -q /var/www/MYJOBHERE_1.PHP >/dev/null 2>&1"

You'll likely find that the tasks will run much quicker from the command line like that, rather than via the webserver. Although you may also need to make some adjustments to the CLI php.ini file, e.g. if a task uses more RAM than the CLI php.ini allows.
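Rather than editing the CLI php.ini, you can also override settings per invocation with PHP's -d flag. The script path here is just the placeholder from your post, and the memory value is only a guess at a sensible limit:

```shell
# override CLI limits for a single run instead of editing php.ini;
# max_execution_time=0 (no limit) is already the CLI default, shown
# here just to make it explicit
php -d memory_limit=512M -d max_execution_time=0 -q /var/www/MYJOBHERE_1.PHP >/dev/null 2>&1
```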

And finally:

... &

This is another alternate method of starting a background process, so again I would say it's somewhat redundant. FWIW, with the default settings, whilst the way they each deal with output is slightly different, the two should function pretty much the same. Although I do note that my googling suggests that there are some scenarios where it may make sense to use the nohup command and a trailing ampersand. But I can't really speak to that...

Then you are stringing your commands together with more single ampersands?! So each of those backgrounding bash commands is itself being run as an asynchronous background command...?! TBH, I think all that would make it incredibly hard to track what is going on with each process...
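FWIW, if you do background things, the shell at least lets you keep track of each job via $! and collect its exit status with wait. A small sketch, with sleep/exit subshells standing in for real jobs:

```shell
# record each background job's PID with $!, then use wait to collect
# its exit status once it finishes (the shell remembers statuses even
# for children that have already exited)
(sleep 0.2; exit 0) & pid1=$!
(exit 3) & pid2=$!
wait "$pid1"; status1=$?
wait "$pid2"; status2=$?
echo "job1 exit=$status1 job2 exit=$status2"
```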

So FWIW, here's how I'd be running the 3 commands in your example (split across multiple lines with '\' for easier readability):

su - www-data \
  -s /bin/bash \
  -c "php -q /var/www/MYJOBHERE_1.PHP & \
      php -q /var/www/MYJOBHERE_2.PHP & \
      php -q /var/www/MYJOBHERE_3.PHP &" \
  >/dev/null 2>&1
Dan wrote:

Thank you Jeremy for your detailed response. You never need to apologize for sharing great information as you do.

Let me explain a little more.

I agree that calling PHP directly from the command line would be better... but I am unable to in this instance, as the PHP code I am calling does not work when I do. I will look further into why this is the case, as I agree it would be much cleaner like that. The code I am using was generated by Scriptcase, a RAD tool for database development, and I suspect it doesn't run from the command line due to dependencies of some kind.

Since I am in fact using the webserver (Apache) to handle these long-running tasks... perhaps it is Apache itself that is prioritizing them and handling them in its own order?

My test environment has 2 PHP scripts that I can call with variables telling them how long to run, so I can test various lengths. If I run the command I listed above about 10 times in rapid succession (press up arrow, press enter, repeat) then my command prompt is still available to me... so they are in fact backgrounded as I requested. But when I watch the tasks run it takes several minutes to get through them all... and they don't run in the order I started them either.

Perhaps, again, it's Apache handling all these processes as it wants to.

If I can get these .PHP programs to run from the command line, I'll be eliminating Apache from the mix, so I'll start there.

Thank you again!
