Webstart PHP files get picked up by Googlebot.

not only that, but they are listed first, before the index.html that is in the same location.

at least in my case they are.
http://www.google.com.au/search?q=JMTetris

someone following the first link might get a surprise if they have java, or think that the page is broken if they don’t.

google has a suggestion to put this into the ‘page’:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

the only alternative is to edit robots.txt, and i don’t have permission. Is there a way to add meta tags to PHP or JNLP? every way i tried resulted in webstart complaining about it.

Ok… after a bit of thinking…

Well, you can include that meta tag for bots only. That is… check the agent via php: if it contains “bot”, you add that meta tag, and otherwise you just don’t.

$agent=addslashes($_SERVER['HTTP_USER_AGENT']);

???

sorry, i’m no php programmer. i have no idea what that example means. (that’s why i posted on this board)


<?php header("Content-type: application/x-java-jnlp-file"); ?>
<jnlp spec="1.0+" codebase="http://www.adam.com.au/kellyjones/perm/" href="JMTetris.jnlp">
      <information>
            <title>JMTetris</title>
            <vendor>JuddMan!</vendor>

i thought having just nofollow on my index.html would work but now i think not, as google would consider php to be a web page and keep it as long as it still exists.

by the way, is (href=“JMTetris.jnlp”) correct, or should it be (href=“JMTetris.php”)?
the JNLP is on the server next to the php

Something like this:


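<?php
// the User-Agent header can be missing, so fall back to an empty string
$agent=isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// bots get an empty HTML page with the robots meta tag (sent with the default HTML content type)
if(stristr($agent,'bot'))
{
   echo '<HTML><HEAD><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"></HEAD><BODY></BODY></HTML>';
   die;
}
// everyone else gets the JNLP content type that webstart expects
header("Content-type: application/x-java-jnlp-file");
?>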
paste your jnlp here

That should give all bots an empty page with that noindex/nofollow meta tag.
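If matching only on “bot” turns out to be too narrow (some crawlers don’t have it in their user agent), the check could be pulled into a small helper. This is just a sketch: the function name looks_like_bot and the token list are made up for illustration and are nowhere near a complete list of crawlers.

```php
<?php
// Hypothetical helper: the name and the token list are assumptions,
// not part of the script above.
function looks_like_bot($agent)
{
    // Substrings that often appear in crawler user agents (illustrative only).
    $tokens = array('bot', 'crawl', 'spider', 'slurp');
    foreach ($tokens as $token) {
        // stripos() compares case-insensitively and returns false on no match.
        if (stripos($agent, $token) !== false) {
            return true;
        }
    }
    return false;
}
```

In the script above you would then write if(looks_like_bot($agent)) instead of if(stristr($agent,'bot')).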

Oh I forgot… I would use href=“JMTetris.php”.
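The reason for pointing href at the .php: as far as I know, Web Start fetches the descriptor again from codebase plus href when it checks for updates, so href should name the URL that actually answers with the JNLP content type. A rough sketch of writing the opening tag from PHP so the href always follows the script name. The helper name jnlp_open_tag is hypothetical; nothing in the thread defines it.

```php
<?php
// Hypothetical helper: builds the opening <jnlp> tag so that href points
// at the script that serves the descriptor.
function jnlp_open_tag($codebase, $script)
{
    // basename() strips the directory part, e.g. "/perm/JMTetris.php" -> "JMTetris.php".
    $href = basename($script);
    // htmlspecialchars() keeps the attribute values well-formed.
    return '<jnlp spec="1.0+" codebase="' . htmlspecialchars($codebase)
         . '" href="' . htmlspecialchars($href) . '">';
}
```

Usage would be something like echo jnlp_open_tag('http://www.adam.com.au/kellyjones/perm/', $_SERVER['PHP_SELF']); in place of the hand-written jnlp line.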

And check if you can use “.htaccess”… just create a plain text file with this stuff:


AddType application/x-java-jnlp-file .jnlp

Save it as “.htaccess” and put it into the same directory on the server. If you can use php it’s very likely that you can use htaccess.

no, i already tried .htaccess

i’d say my isp just allows PHP cause they use it for member services and the member pages are hosted on the same server they host their homepage on. they don’t need .htaccess cause they can set content type directly. they would not be at all interested in enabling special features just so joe user can have a game run off their servers, so we have to work with what we’ve got.

thanks for the PHP script. Webstart accepts it, and time will tell if google has noticed.

-judd

I see. So you’ve indirectly paid for that webspace?

I would just send em a friendly mail and ask for it. It’s a piece of cake to add and they are most likely interested in satisfied customers. Heck… even the free webspace I’ve used in the past had that mime type setup correctly (surprisingly ;)).

I accidentally had another idea.

You can check the referer and if it contains “google” you redirect the visitor over to your index file. The benefit of this approach is that you won’t have to wait until google has sorted that file out of its index.

Some browsers allow you to disable the referer, but most people don’t do that. So at least 80% of the visitors will end up at the correct page ;)


<?php
//***disable caching***
// Date in the past
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
// always modified
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
// HTTP/1.1
header("Cache-Control: no-store, no-cache, must-revalidate");
header("Cache-Control: post-check=0, pre-check=0", false);
// HTTP/1.0
header("Pragma: no-cache");

//***check agent for 'bot'***
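// the User-Agent header can be missing, so fall back to an empty string
$agent=isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';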
if(stristr($agent,'bot'))
{
   echo '<HTML><HEAD><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"></HEAD><BODY></BODY></HTML>';
   die;
}

//***check referer for 'google'***
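// direct visits send no referer at all, so fall back to an empty string
$ref=isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';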
if(stristr($ref,'google'))
{
   header("Location: http://www.adam.com.au/kellyjones/perm/");
   die;
}
header("Content-type: application/x-java-jnlp-file");

//***paste your jnlp below the closing tag***
?>

Caching should be disabled because otherwise a cached redirect could be replayed for people who come from a “correct” page (anything other than google).
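
One caveat with stristr($ref,'google'): it also fires when “google” shows up anywhere in the referring URL, for example in a query string on some other site. A slightly stricter sketch that only looks at the host name. came_from_google is a made-up name, and the pattern is an assumption about what google’s host names look like, so it is still a loose match.

```php
<?php
// Hypothetical helper: true when the referer's host looks like a google domain.
function came_from_google($referer)
{
    // parse_url() returns null when the URL has no host part (e.g. an empty referer).
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host)) {
        return false;
    }
    // Accept "google.<tld>" with optional subdomains, e.g. www.google.com.au.
    return preg_match('/(^|\.)google\.[a-z.]+$/i', $host) === 1;
}
```

In the script above the check would become if(came_from_google($ref)) with the same redirect inside.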