The dreaded "Page not found" error
We've been having an intermittent error where, after hitting Preview or Save, you get a "Page not found" error (and possibly end up somewhere else, depending on your browser) instead of seeing the post you expected. The error is especially frustrating because if you hit your browser's back button, the original post might not be there.
First, before you press Save, you always do a Select All and then a Copy of what you wrote, right? Because if you Copy your post, you won't lose it! (And if you're really obsessive, keep Notepad on Windows or Stickies on OS X open, and Paste what you copied there.) This is the Internet, and posting fails all the time, so the Select All/Copy habit is a very good one to get into.
That said, here's what's going on with the error:
The site is in a constant request/response cycle of page building. You make a request by clicking a link, and the site responds by constructing the page "at the end of" that link. You make a request by pressing Save, and the site responds by constructing a page that includes what you typed into the Body field. You make a request by typing a URL into the browser's address bar, and the site responds by "going to" the page "at the end of" that URL by building that page.
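For the programmers in the audience, here is a toy sketch of that request/response cycle in Python. It is not the site's actual code; the function name, the paths, and the "body" form field are all made up for illustration.

```python
import html

# A toy model of the request/response cycle, NOT the site's real code.
# handle_request, the paths, and the "body" field are hypothetical names.

def handle_request(method, path, form=None):
    """Build and return the page 'at the end of' a request."""
    form = form or {}
    if method == "GET" and path == "/":
        # Clicking a link or typing a URL: build the page at that address.
        return "<h1>Front page</h1>"
    if method == "POST" and path == "/save":
        # Pressing Save: build a page that includes what was typed into Body.
        body = html.escape(form.get("body", ""))
        return "<h1>Your post</h1><p>" + body + "</p>"
    # Anything the site cannot build a page for.
    return "<h1>Page not found</h1>"


print(handle_request("POST", "/save", {"body": "Hello, Corrente!"}))
```

Every page you see is the return value of something like this: a request goes in, and a freshly built page comes out.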
Absent social engineering, attacks on the site are going to come in the form of requests; that's how to get data from the outside world through the browser and from there to the server. For example, imagine that a URL like "http://www.correntewire.com?=DESTROY ALL DATA!" really did destroy all data! Or imagine a malicious user was able to create a post whose text was really program code that erased everybody's posts when the Save button was pressed! Clearly, the site must defend itself against attacks like this; without requests, there's no site; but malicious requests could bring the site down.
So the server that hosts Corrente has the digital equivalent of an "immune system" to help it sort malicious requests from legitimate ones. (Account approval is the outermost layer of defense; it's not all that easy to get yourself into a position where you've got the ability to make requests that submit data at Corrente; an SEO weasel got through today, but there are many orders of magnitude more who don't.) In particular, the site checks every post to make sure there's no malicious program code in it. And we want the site to do that! (Imagine that today's SEO weasel was really a detractor who could program, for example, instead of some kid from the other side of the world who made a few pennies trying to turn Corrente into a link farm.)
Now, humans don't have any difficulty distinguishing program code from the prose of a post or a comment. However, mechanical rules are all that our servers have to work with, and sometimes the rules get triggered by "false positives."
As in the "Page Not Found" that happened today. (If you make the assumption that your post data was not handcrafted deathless prose but was, in fact, malicious program code meant to destroy the site, you will see that a Page Not Found, combined with blowing away the data, is really the desired response. Slam the door on the intruder, and throw the ticking package out the window!)
So why the "false positives" today? Looking at the logs, there's no simple answer. (That is, there are no "blacklisted" magic words that one should refrain from typing; the rules, though mechanical, are very complex.) However, the rules of the "immune system" -- you might think of them as leucocytes -- operate by pattern recognition in the request data (for example, if there are a lot of pointy brackets, the data is probably HTML; and if there are a lot of squiggly brackets, the data might be in one programming language or another). And for whatever reason, the false positive data was so complicated -- with respect to the mechanical patterns, and not as prose -- that it overwhelmed the "immune system," which failed before completing its check. When that happens, the post is rejected.
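A rough sketch of how a pattern-based checker can reject a harmless post may help here. This is only an illustration in Python, not the server's real rule set: the three rules, their weights, and the SCORE_LIMIT and MATCH_BUDGET numbers are all assumptions invented for the example. The one idea it is meant to show is "failing closed": if the checker runs out of capacity before it finishes, it rejects the post anyway.

```python
import re

# Hypothetical rules: (name, pattern, score added per match).
# Real rule sets are far larger and more complex than this.
RULES = [
    ("pointy_brackets", re.compile(r"[<>]"), 1),
    ("squiggly_brackets", re.compile(r"[{}]"), 1),
    ("script_tag", re.compile(r"<\s*script", re.IGNORECASE), 100),
]

SCORE_LIMIT = 100   # assumed: at or above this, the post looks like an attack
MATCH_BUDGET = 50   # assumed: how much matching work the checker may do


def check_post(body):
    """Return (verdict, reason) for the text of a post."""
    score = 0
    work = 0
    for _name, pattern, weight in RULES:
        for _match in pattern.finditer(body):
            work += 1
            if work > MATCH_BUDGET:
                # Ran out of capacity before finishing the check.
                # Fail closed: reject, even though nothing was proven.
                return ("reject", "budget")
            score += weight
    if score >= SCORE_LIMIT:
        return ("reject", "score")
    return ("accept", "clean")


print(check_post("Hello, world."))
print(check_post("<script>alert(1)</script>"))
print(check_post("if a < b and b > c: " * 30))
```

The third call is the false positive: the text never reaches the attack score, but it contains so many pattern matches that the checker gives up partway through and rejects it. In this sketch, raising MATCH_BUDGET is the analogue of giving the "immune system" more capacity.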
Long story short: we strengthened the capacity of the "leucocytes" to process complex rules. Hopefully, that will eliminate most of the false positives, which are frustrating.
If the "Page Not Found" error happens to you, I might need to adjust the rules. So when it does, if you could let me know the IP address you're at -- http://www.whatismyip.com/ will tell you -- that will let me go into the logs and track down the rule that fired on a false positive. Thanks!
NOTE: The problem will be completely solved when there are no more attacks on sites. Until that day, all we can hope to do is mitigate. Since these false positives, IIRC, only started happening relatively recently, I'm guessing they're a consequence of the site upgrade we did. Some setting got banged, or some switch got thrown from on to off...