Reset connections and long PHP processing...

Wolf_22
Forum Contributor
Posts: 159
Joined: Fri Dec 26, 2008 9:43 pm

Reset connections and long PHP processing...

Post by Wolf_22 »

I'm currently testing an application I wrote that runs a long routine to extract e-mails via an IMAP library. (It's an archival app.) What I'm running into in my tests is that after about 2 hours and 15 minutes, I receive disconnect errors in Firefox:
The connection was reset
The connection to the server was reset while the page was loading.
I'm using WAMP on my Windows 7 machine, so I'm not sure whether this error is coming from my own server or from the server that my script is executing the POST against. There are no revealing errors in the Apache error logs, the PHP error logs, or even the Windows system logs, but the one consistent thing I'm noticing is that the process ends almost exactly after 2 hours and 15 minutes.

Before I do anything in the script, I use PHP to change both the memory_limit and max_execution_time ini values so that no resource problems arise, and I also have error_reporting set to -1 (to ensure that I see *all* errors). I haven't seen a single runtime error, and the memory that PHP uses is always around 95x.xx KB before bombing out. (So I'm not sure whether this is an issue with the resources I've allocated to my WAMP server or a security facility on the server my script is extracting e-mails from that terminates the connection, given the consistent duration of roughly 2 hours and 15 minutes of processing.)
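
For reference, the setup described above might look roughly like the following. This is only a sketch: the helper name configureLongRun() and the '512M' value are assumptions for illustration, not the poster's actual code. Note that these settings only govern PHP itself; they do not change any timeout enforced by Apache, the browser, or a remote server.

```php
<?php
// Sketch: raise PHP's own limits at the top of a long-running script.
// configureLongRun() and the default '512M' are hypothetical.

function configureLongRun(string $memoryLimit = '512M'): array
{
    error_reporting(-1);                  // report every error level
    ini_set('log_errors', '1');           // make sure errors reach the PHP error log
    ini_set('memory_limit', $memoryLimit);
    set_time_limit(0);                    // 0 removes PHP's execution time limit

    // Return what PHP now reports, so the values can be logged and checked.
    return [
        'memory_limit' => ini_get('memory_limit'),
        'peak_kb'      => round(memory_get_peak_usage(true) / 1024, 2),
    ];
}
```

Logging the returned array at the start and end of a run is one way to confirm that the limits were actually applied and to see how much memory the run really needed.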

I'm curious: is there any way to trap these "Connection was reset" errors? I'm using the php-imap library found here: https://github.com/barbushin/php-imap. So far, it's done exactly what I've needed but I'm not sure if this is causing any issues or not.
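
One hedged note on trapping: the "connection was reset" page is produced by the browser, so PHP cannot catch it directly. What a script can do is keep running after the client goes away and record its own state at shutdown. A minimal sketch follows, assuming PHP 7 or later; connectionState() and installShutdownLogger() are hypothetical helper names, while ignore_user_abort(), connection_status(), error_get_last(), and register_shutdown_function() are standard PHP functions. Keep in mind that PHP generally only notices a dropped client when it next tries to send output.

```php
<?php
// Sketch only: the two helpers below are hypothetical, not part of php-imap.

function connectionState(int $status): string
{
    if ($status === CONNECTION_NORMAL) {
        return 'normal';
    }
    $parts = [];
    if ($status & CONNECTION_ABORTED) {
        $parts[] = 'aborted';   // the client went away
    }
    if ($status & CONNECTION_TIMEOUT) {
        $parts[] = 'timeout';   // PHP's execution time limit was hit
    }
    return implode('+', $parts);
}

function installShutdownLogger(string $logFile): void
{
    ignore_user_abort(true); // keep running even if the browser disconnects

    register_shutdown_function(function () use ($logFile) {
        $last = error_get_last(); // null unless an error was recorded
        $line = sprintf(
            "[%s] connection=%s last_error=%s\n",
            date('c'),
            connectionState(connection_status()),
            $last === null ? 'none' : $last['message']
        );
        file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
    });
}
```

If the log shows the script shutting down with a normal connection state and no error at the 2 hour 15 minute mark, that would point away from PHP and toward something outside it.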

If I could just figure out how to trap this outcome, I could add some logic that reconnects to the e-mail server and continues extracting e-mails from where it left off. Otherwise, I'll have to let the user restart the process manually, first checking for the last e-mail ID that was extracted and then continuing from there (which I don't want to do; I'd like it to work continuously without manual intervention).
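
The "reconnect and continue from the last e-mail ID" idea can be sketched independently of any particular IMAP library. Everything here is an assumption for illustration: archiveWithResume(), the $connect and $fetch callables, and the checkpoint file are hypothetical and not part of php-imap's API. The sketch also assumes message IDs are ascending integers and that a dropped connection surfaces as a RuntimeException from $fetch.

```php
<?php
// Sketch: checkpoint after every message, reconnect and retry on failure.

function archiveWithResume(
    array $ids,
    callable $connect,
    callable $fetch,
    string $checkpointFile,
    int $maxRetries = 3
): int {
    // Resume after the last ID recorded by a previous run, if any.
    $lastDone = is_file($checkpointFile) ? (int) file_get_contents($checkpointFile) : 0;
    $conn = $connect();
    $archived = 0;

    foreach ($ids as $id) {
        if ($id <= $lastDone) {
            continue; // already archived in an earlier run
        }
        for ($attempt = 1; ; $attempt++) {
            try {
                $fetch($conn, $id);
                break; // success, move on to the checkpoint
            } catch (RuntimeException $e) {
                if ($attempt >= $maxRetries) {
                    throw $e; // give up; the checkpoint still marks progress
                }
                $conn = $connect(); // assume the connection dropped; open a new one
            }
        }
        file_put_contents($checkpointFile, (string) $id, LOCK_EX);
        $archived++;
    }
    return $archived;
}
```

Because the checkpoint is written after each message, the same function covers both cases described above: automatic retry within a run, and a manual restart that picks up where the last run stopped.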

Any insight into this would be appreciated.
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: Reset connections and long PHP processing...

Post by requinix »

The web is not meant for requests that take a long time to complete. Certainly not something that will take hours. Move this processing to a command-line script and you won't have to worry about this; in fact I'm running a PHP script at work that's been going for two days straight and won't stop until sometime later this week.
Wolf_22
Forum Contributor
Posts: 159
Joined: Fri Dec 26, 2008 9:43 pm

Re: Reset connections and long PHP processing...

Post by Wolf_22 »

requinix wrote:The web is not meant for requests that take a long time to complete.
Don't you think that there's a time and place for everything? I understand why you'd say something like that but I'm just not sure that it's true in every situation. Certain circumstances merit certain approaches (granted, what I've done may not be the best approach but I think most of this whole issue is due to using it for the first time and having thousands of e-mails in my inbox [instead of a properly-archived mailbox that may have only a few hundred]; I just think that "playing catch-up" is causing this more than the act of the request). I still think the disconnects, though, are from the IDSs in play or else the app server configurations (either my own or else the remote system that I'm connecting to / making requests to from within my script).
requinix wrote:...Certainly not something that will take hours. Move this processing to a command-line script and you won't have to worry about this; in fact I'm running a PHP script at work that's been going for two days straight and won't stop until sometime later this week.
Is that what you're going to do with your week-long script then, or are you saying that you've already done this with something like PHP CLI? (I'm not sure how you'd otherwise archive Outlook e-mails [via port 80?] using some sort of command-line script, but feel free to shed some light on this for me... I've never used PHP CLI so I'm not sure of the capabilities it has. Is that what you're referring to?)

(Sorry for the confusion but thanks for the follow-up. :) )
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: Reset connections and long PHP processing...

Post by requinix »

Wolf_22 wrote:Don't you think that there's a time and place for everything? I understand why you'd say something like that but I'm just not sure that it's true in every situation.
Maybe I'm misreading you but you waaay over-generalized what I said. All I meant was that you shouldn't use the web for long-running processes because it wasn't built for that. I mean, half the reason AJAX exists is because the old method (a long-running connection that "streams" data) was technically laborious and broke easily. And long-running pages are a common attack vector for DDoSes, exhausting connections on both web and database servers.
Wolf_22 wrote:Certain circumstances merit certain approaches
Yes...
Wolf_22 wrote:(granted, what I've done may not be the best approach
Yes...
Wolf_22 wrote:but I think most of this whole issue is due to using it for the first time and having thousands of e-mails in my inbox [instead of a properly-archived mailbox that may have only a few hundred];
You'd probably hit memory limits first. Then execution time, but I suspect it would have manifested in a way you'd notice immediately.
Wolf_22 wrote:I just think that "playing catch-up" is causing this more than the act of the request).
By "the request" I mean mostly PHP. Not the literal request portion itself where the browser constructs the HTTP request and the server receives it.
Wolf_22 wrote:I still think the disconnects, though, are from the IDSs in play or else the app server configurations (either my own or else the remote system that I'm connecting to / making requests to from within my script).
IDS being... intrusion detection system? Possible but not my first guess. Normally I'd think the operating system and/or your web server settings; if the OS then I'd expect errors, if Apache then, well, I'd still expect errors. If you're using up a lot of memory then it could even be the OS reacting, though I've only seen that happen in Linux and not Windows.

2 hours and 15 minutes is 135 minutes, or 8,100 seconds. That's a very... human number. I wouldn't be surprised to see that number in some configuration settings.
Wolf_22 wrote:Is that what you're going to do with your week-long script then or are you saying that you've already done this with something like PHP CLI?
Yeah: it's entirely command-line. The nature of the script makes it much easier to do from the command line to begin with, but had the reverse been true I still wouldn't have done it over HTTP.
Wolf_22 wrote:(I'm not sure how you'd otherwise archive Outlook e-mails [via port 80?] using some sort of command-line script, but feel free to shed some light on this for me... I've never used PHP CLI so I'm not sure of the capabilities it has. Is that what you're referring to?)
CLI PHP, Apache PHP, CGI PHP... those are all basically the same PHP. This stuff with Outlook? Port 80 is HTTP so I'm not sure what it's doing... Unless you need to interact with the browser, like, I don't know, JavaScript or ActiveX or something, I bet you could do it just fine from the command line too. And if you don't use $_GET or $_POST then you may be able to just run it from the command line directly without changes; even if you do, it's pretty easy to get information into a command-line PHP script.
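
As a rough illustration of getting information into a command-line PHP script, the sketch below reads the same parameters either from --key=value arguments or from web input. The function name readParams(), the script name archive.php, and the parameter names mailbox and start are hypothetical; PHP's built-in getopt() is an alternative for the parsing step.

```php
<?php
// Sketch: one entry point that works under both the CLI and a web SAPI.

function readParams(array $argv, array $webInput, string $sapi = PHP_SAPI): array
{
    if ($sapi !== 'cli') {
        return $webInput;                       // e.g. $_GET or $_POST under Apache
    }
    $params = [];
    foreach (array_slice($argv, 1) as $arg) {   // skip the script name
        if (preg_match('/^--([^=]+)=(.*)$/', $arg, $m)) {
            $params[$m[1]] = $m[2];             // values arrive as strings
        }
    }
    return $params;
}

// From a terminal:  php archive.php --mailbox=INBOX --start=1200
// In the script:    $params = readParams($argv ?? [], $_POST);
```

With this shape, the rest of the script only ever looks at $params, so moving from the browser to the command line does not require touching the archiving logic.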

Now I say all this, of course, without knowing exactly what this script is or what the code looks like, but if this 2 hour 15 minute thing is taking away a lot of your time then that time might be better invested in converting the script to work from the command line. So at the very least it's a numbers question: is it worth spending the time to fix it, or is it better to leave it as it is even with the problems?